Abstract We study multivariate normal models that are described by linear constraints on the inverse of
the covariance matrix. Maximum likelihood estimation for such models leads to the problem of maximizing 
the determinant function over a spectrahedron, and to the problem of characterizing the image of the 
positive definite cone under an arbitrary linear projection. These problems at the interface of statistics 
and optimization are here examined from the perspective of convex algebraic geometry.

1 Introduction

Every positive definite $m \times m$-matrix $\Sigma$ is the covariance matrix of a multivariate normal distribution on $\mathbb{R}^m$. Its inverse matrix $K = \Sigma^{-1}$ is also positive definite and known as the concentration matrix of the distribution. We study statistical models for multivariate normal distributions on $\mathbb{R}^m$, where the concentration matrix can be written as a linear combination

$$K = \lambda_1 K_1 + \lambda_2 K_2 + \cdots + \lambda_d K_d \tag{1}$$

of some fixed linearly independent symmetric matrices $K_1, \ldots, K_d$. Here, $\lambda_1, \lambda_2, \ldots, \lambda_d$ are unknown real coefficients. It is assumed that $K$ is positive definite for some choice of $\lambda_1, \lambda_2, \ldots, \lambda_d$. Such statistical models, which we call linear concentration models, were introduced by Anderson (1970).

Let $\mathbb{S}^m$ denote the vector space of real symmetric $m \times m$-matrices. We identify $\mathbb{S}^m$ with its dual space via the inner product $\langle A, B \rangle := \operatorname{trace}(A \cdot B)$. The cone $\mathbb{S}^m_{\succeq 0}$ of positive semidefinite matrices is a full-dimensional self-dual cone in $\mathbb{S}^m$. Its interior is the open cone $\mathbb{S}^m_{\succ 0}$ of positive definite matrices. We define a linear concentration model to be any non-empty set of the form

$$\mathcal{L}^{-1}_{\succ 0} := \{ \Sigma \in \mathbb{S}^m_{\succ 0} : \Sigma^{-1} \in \mathcal{L} \},$$

where $\mathcal{L}$ is a linear subspace of $\mathbb{S}^m$. Given a basis $K_1, \ldots, K_d$ of the subspace $\mathcal{L}$ as in (1), the basic statistical problem is to estimate the parameters $\lambda_1, \ldots, \lambda_d$ when $n$ observations $X_1, \ldots, X_n$ are drawn from a multivariate normal distribution $\mathcal{N}(\mu, \Sigma)$, whose covariance matrix $\Sigma = K^{-1}$ is in the model $\mathcal{L}^{-1}_{\succ 0}$. The $n$ observations $X_i$ and their mean $\bar{X}$ are summarized in the sample covariance matrix

$$S = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})(X_i - \bar{X})^T \in \mathbb{S}^m_{\succeq 0}.$$

In our model, we make no assumptions on the mean vector $\mu$ and always use the sample mean $\bar{X}$ as estimate for $\mu$. Thus, we are precisely in the situation of (Drton et al. 2009, Prop. 2.1.12), with $\Theta_2 = \mathcal{L}^{-1}_{\succ 0}$.



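The log-likelihood function for the linear concentration model (1) equals

$$\log\det(K) - \langle S, K \rangle = \log\det\Bigl( \sum_{j=1}^{d} \lambda_j K_j \Bigr) - \sum_{j=1}^{d} \lambda_j \langle S, K_j \rangle \tag{2}$$

times the constant $n/2$. This is a strictly concave function on the relatively open cone $\mathbb{S}^m_{\succ 0} \cap \mathcal{L}$. If a maximum (i.e. the maximum likelihood estimate or MLE) exists, then it is attained by a unique matrix $\hat{K}$ in $\mathbb{S}^m_{\succ 0} \cap \mathcal{L}$. Its inverse $\hat{\Sigma} = \hat{K}^{-1}$ is uniquely determined by the linear equations

$$\langle \hat{\Sigma}, K_j \rangle = \langle S, K_j \rangle \quad \text{for } j = 1, 2, \ldots, d. \tag{3}$$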
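The equations (3) can also be checked by hand. The following short derivation is only a sketch; it assumes nothing beyond the standard identity $\partial \log\det(K) = \operatorname{trace}(K^{-1}\,\partial K)$ and the definition of the inner product above.

```latex
\frac{\partial}{\partial \lambda_j}
  \Bigl[ \log\det(K) - \langle S, K \rangle \Bigr]
  \;=\; \operatorname{trace}\bigl( K^{-1} K_j \bigr) - \langle S, K_j \rangle
  \;=\; \langle K^{-1}, K_j \rangle - \langle S, K_j \rangle ,
  \qquad K = \sum_{i=1}^{d} \lambda_i K_i .
```

Setting these $d$ partial derivatives to zero at $K = \hat{K}$ and writing $\hat{\Sigma} = \hat{K}^{-1}$ gives exactly the conditions $\langle \hat{\Sigma}, K_j \rangle = \langle S, K_j \rangle$ for $j = 1, \ldots, d$.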



This characterization follows from the statistical theory of exponential families (Brown 1986, §5). In that theory, the scalars $\lambda_1, \ldots, \lambda_d$ are the canonical parameters and $\langle S, K_1 \rangle, \ldots, \langle S, K_d \rangle$ are the sufficient statistics of the exponential family (1). For a special case see (Drton et al. 2009, Theorem 2.1.14).



We consider the set of all covariance matrices whose sufficient statistics are given by the matrix $S$. This set is a spectrahedron. It depends only on $S$ and $\mathcal{L}$, and it is denoted

$$\mathrm{fiber}_{\mathcal{L}}(S) = \{ \Sigma \in \mathbb{S}^m_{\succ 0} : \langle \Sigma, K \rangle = \langle S, K \rangle \text{ for all } K \in \mathcal{L} \}.$$

The MLE exists for a sample covariance matrix $S$ if and only if $\mathrm{fiber}_{\mathcal{L}}(S)$ is non-empty. If $\operatorname{rank}(S) < m$ then it can happen that the fiber is empty, in which case the MLE does not exist for $(\mathcal{L}, S)$. Work of Buhl (1993) and Barrett et al. (1993) addresses this issue for graphical models; see Section 4 below.

Our motivating statistical problem is to identify conditions on the pair $(\mathcal{L}, S)$ that ensure the existence of the MLE. This will involve studying the geometry of the semi-algebraic set $\mathcal{L}^{-1}_{\succ 0}$ and of the algebraic function $S \mapsto \hat{\Sigma}$ which takes a sample covariance matrix to its MLE in $\mathcal{L}^{-1}_{\succ 0}$.

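Example 1.1. We illustrate the concepts introduced so far by way of a small explicit example whose geometry is visualized in Fig. 1. Let $m = d = 3$ and let $\mathcal{L}$ be the real vector space spanned by

$$K_1 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 1 & 1 \end{pmatrix}, \quad K_2 = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix} \quad \text{and} \quad K_3 = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

The linear concentration model (1) consists of all positive definite matrices of the form

$$K = \begin{pmatrix} \lambda_1 + \lambda_2 + \lambda_3 & \lambda_3 & \lambda_2 \\ \lambda_3 & \lambda_1 + \lambda_2 + \lambda_3 & \lambda_1 \\ \lambda_2 & \lambda_1 & \lambda_1 + \lambda_2 + \lambda_3 \end{pmatrix}. \tag{4}$$

Given a sample covariance matrix $S = (s_{ij})$, the sufficient statistics are

$$t_1 = \operatorname{trace}(S) + 2 s_{23}, \quad t_2 = \operatorname{trace}(S) + 2 s_{13}, \quad t_3 = \operatorname{trace}(S) + 2 s_{12}.$$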
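As a small illustration of the map from a sample covariance matrix to its sufficient statistics, the following sketch evaluates $t_j = \langle S, K_j \rangle$ for the three basis matrices read off from (4). The matrix `S` used here is a hypothetical input chosen only for illustration, and the helper names are not from the paper.

```python
from fractions import Fraction

def trace_inner(A, B):
    """Inner product <A, B> = trace(A * B) for square matrices given as nested lists."""
    n = len(A)
    return sum(A[i][j] * B[j][i] for i in range(n) for j in range(n))

# Basis of the subspace L in Example 1.1: K_j is the coefficient of lambda_j in (4).
K1 = [[1, 0, 0], [0, 1, 1], [0, 1, 1]]
K2 = [[1, 0, 1], [0, 1, 0], [1, 0, 1]]
K3 = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]

def sufficient_statistics(S):
    """Return (t1, t2, t3) = (<S, K1>, <S, K2>, <S, K3>)."""
    return tuple(trace_inner(S, K) for K in (K1, K2, K3))

# Hypothetical sample covariance matrix (symmetric), for illustration only.
S = [[Fraction(4), Fraction(1), Fraction(-1)],
     [Fraction(1), Fraction(3), Fraction(2)],
     [Fraction(-1), Fraction(2), Fraction(5)]]

t1, t2, t3 = sufficient_statistics(S)
trace_S = S[0][0] + S[1][1] + S[2][2]
```

Exact rational arithmetic is used so that the identities $t_1 = \operatorname{trace}(S) + 2 s_{23}$, and so on, can be compared without rounding.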

Fig. 1 Three figures, taken from Nie et al. (2009), illustrate Example 1.1. These figures show the spectrahedron $\mathrm{fiber}_{\mathcal{L}}(S)$ (left), a cross section of the spectrahedral cone $\mathcal{K}_{\mathcal{L}}$ (middle), and a cross section of its dual cone $\mathcal{C}_{\mathcal{L}}$ (right).

If $S \in \mathbb{S}^3_{\succ 0}$ then $\mathrm{fiber}_{\mathcal{L}}(S)$ is an open 3-dimensional convex body whose boundary is a cubic surface. This is the spectrahedron shown on the left in Fig. 1. The MLE $\hat{\Sigma}$ is the unique matrix of maximum determinant in the spectrahedron $\mathrm{fiber}_{\mathcal{L}}(S)$. Here is an explicit algebraic formula for the MLE $\hat{\Sigma} = (\hat{s}_{ij})$:



= 240s;^3 + (-32ti - 32t2 - 192^3)5^3 + {-~%t\ + \Uxti + - %t\ + 16^2^3 + 32*^)5^3 

+ (8t? - %t\t2 - Shtl + 84)s33 - Atlts - 644 + + 4iit^ + Uitlt^ + Ult2h - 4 
-A4t3 + 4,44 - 8tit24 -4 + 4i?t2- 
Next, we read off $\hat{s}_{23}$ from

-24 (4 - 2tit2 +4- 4) S23 = 1203^3 - (16ti + 16t2 + 36t3)s^3 + (2t^ - 4tit2 + 24 - 8^^)533 - 

+i84t2 + 4h - 18^1^2 - 2tit2t3 + mi4 + 64 + 4h - 2^2*3 - ■^4- 

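Then we read off $\hat{s}_{22}$ from

$$-24 (t_1 - t_2)\, \hat{s}_{22} = 60 \hat{s}_{33}^2 + (4 t_1 - 20 t_2 - 24 t_3) \hat{s}_{33} + 24 (t_1 - t_2 - t_3) \hat{s}_{23} - 11 t_1^2 + 10 t_1 t_2 + 10 t_1 t_3 + t_2^2 - 2 t_2 t_3 - 4 t_3^2.$$

Finally, we obtain the first row of $\hat{\Sigma}$ as follows:

$$\hat{s}_{13} = \hat{s}_{23} - t_1/2 + t_2/2, \quad \hat{s}_{12} = \hat{s}_{23} - t_1/2 + t_3/2, \quad \hat{s}_{11} = t_1 - \hat{s}_{33} - 2\hat{s}_{23} - \hat{s}_{22}.$$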

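The MLE $\hat{\Sigma} = (\hat{s}_{ij})$ is an algebraic function of degree 4 in the sufficient statistics $(t_1, t_2, t_3)$. In short, the model (4) has ML degree 4. We identify our model with the subvariety $\mathcal{L}^{-1}$ of projective space $\mathbb{P}^5$ that is parametrized by this algebraic function. The ideal of polynomials vanishing on $\mathcal{L}^{-1}$ equals

$$\begin{aligned}
P_{\mathcal{L}} = \langle\; & s_{13}^2 - s_{23}^2 - s_{11}s_{33} + s_{22}s_{33}, \quad s_{12}^2 - s_{11}s_{22} - s_{23}^2 + s_{22}s_{33}, \\
& s_{12}s_{13} - s_{13}s_{22} - s_{11}s_{23} + s_{12}s_{23} + s_{13}s_{23} + s_{23}^2 - s_{12}s_{33} - s_{22}s_{33}, \\
& s_{11}s_{13} - s_{13}s_{22} - s_{11}s_{23} + s_{22}s_{23} - s_{11}s_{33} - 2s_{12}s_{33} - s_{13}s_{33} - s_{22}s_{33} - s_{23}s_{33} + s_{33}^2, \\
& s_{11}s_{12} - s_{11}s_{22} - s_{12}s_{22} - 2s_{13}s_{22} + s_{22}^2 - s_{11}s_{23} - s_{22}s_{23} - s_{12}s_{33} - s_{22}s_{33} + s_{23}s_{33}, \\
& s_{11}^2 - 2s_{11}s_{22} - 4s_{13}s_{22} + s_{22}^2 - 4s_{11}s_{23} - 2s_{11}s_{33} - 4s_{12}s_{33} - 2s_{22}s_{33} + s_{33}^2 \;\rangle.
\end{aligned}$$

The domain of the maximum likelihood map $(t_1, t_2, t_3) \mapsto \hat{\Sigma}$ is the cone of sufficient statistics $\mathcal{C}_{\mathcal{L}}$ in $\mathbb{R}^3$. The polynomial $H_{\mathcal{L}}$ which vanishes on the boundary of this convex cone has degree six. It equals

$$\begin{aligned}
H_{\mathcal{L}} = {} & t_1^6 - 6t_1^5t_2 + 19t_1^4t_2^2 - 28t_1^3t_2^3 + 19t_1^2t_2^4 - 6t_1t_2^5 + t_2^6 - 6t_1^5t_3 + 14t_1^4t_2t_3 - 24t_1^3t_2^2t_3 - 24t_1^2t_2^3t_3 \\
& + 14t_1t_2^4t_3 - 6t_2^5t_3 + 19t_1^4t_3^2 - 24t_1^3t_2t_3^2 + 106t_1^2t_2^2t_3^2 - 24t_1t_2^3t_3^2 + 19t_2^4t_3^2 - 28t_1^3t_3^3 - 24t_1^2t_2t_3^3 \\
& - 24t_1t_2^2t_3^3 - 28t_2^3t_3^3 + 19t_1^2t_3^4 + 14t_1t_2t_3^4 + 19t_2^2t_3^4 - 6t_1t_3^5 - 6t_2t_3^5 + t_3^6.
\end{aligned}$$

The sextic curve $\{H_{\mathcal{L}} = 0\}$ in $\mathbb{P}^2$ is shown on the right in Fig. 1. It is dual to the cubic curve $\{\det(K) = 0\}$, shown in the middle of Fig. 1. The cone over the convex region enclosed by the red part of that cubic curve is the set $\mathcal{K}_{\mathcal{L}} = \mathbb{S}^3_{\succ 0} \cap \mathcal{L}$ of concentration matrices in our model (4). □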
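The two polynomial objects of Example 1.1 can be sanity-checked with exact integer arithmetic. The sketch below assumes the six quadrics and the symmetric sextic as read in this example: it evaluates the quadrics on the adjugate of the matrix (4), which is proportional to $K^{-1}$, and evaluates the sextic at sufficient statistics $t_j = v^T K_j v$ of kernel vectors $v$ of singular matrices in $\mathcal{L}$. All function names are hypothetical helpers, not code from the paper.

```python
from itertools import product

def adjugate_entries(l1, l2, l3):
    """Entries (s11, s12, s13, s22, s23, s33) of adj(K) for the matrix K in (4)."""
    a = l1 + l2 + l3
    s11, s22, s33 = a * a - l1 * l1, a * a - l2 * l2, a * a - l3 * l3
    s12, s13, s23 = l1 * l2 - a * l3, l1 * l3 - a * l2, l2 * l3 - a * l1
    return s11, s12, s13, s22, s23, s33

def ideal_generators(s11, s12, s13, s22, s23, s33):
    """The six quadrics generating the ideal of Example 1.1, as read from the text."""
    return [
        s13**2 - s23**2 - s11*s33 + s22*s33,
        s12**2 - s11*s22 - s23**2 + s22*s33,
        s12*s13 - s13*s22 - s11*s23 + s12*s23 + s13*s23 + s23**2 - s12*s33 - s22*s33,
        s11*s13 - s13*s22 - s11*s23 + s22*s23 - s11*s33 - 2*s12*s33
        - s13*s33 - s22*s33 - s23*s33 + s33**2,
        s11*s12 - s11*s22 - s12*s22 - 2*s13*s22 + s22**2 - s11*s23
        - s22*s23 - s12*s33 - s22*s33 + s23*s33,
        s11**2 - 2*s11*s22 - 4*s13*s22 + s22**2 - 4*s11*s23 - 2*s11*s33
        - 4*s12*s33 - 2*s22*s33 + s33**2,
    ]

# The sextic is symmetric in t1, t2, t3, so each coefficient depends only on
# the sorted exponent vector of its monomial.
SEXTIC_COEFF = {(6, 0, 0): 1, (5, 1, 0): -6, (4, 2, 0): 19, (3, 3, 0): -28,
                (4, 1, 1): 14, (3, 2, 1): -24, (2, 2, 2): 106}

def sextic(t1, t2, t3):
    """Evaluate the degree-six polynomial of Example 1.1 at (t1, t2, t3)."""
    total = 0
    for e in product(range(7), repeat=3):
        if sum(e) == 6:
            c = SEXTIC_COEFF[tuple(sorted(e, reverse=True))]
            total += c * t1**e[0] * t2**e[1] * t3**e[2]
    return total

def stats_of_rank_one(v):
    """Sufficient statistics t_j = v^T K_j v of the rank-one matrix v v^T."""
    v1, v2, v3 = v
    n = v1 * v1 + v2 * v2 + v3 * v3
    return n + 2 * v2 * v3, n + 2 * v1 * v3, n + 2 * v1 * v2

# Kernel vectors of the singular matrices K(1, 0, 0) and K(1, -1, 0) in L.
kernel_vectors = [(0, 1, -1), (1, 1, 0)]
```

The first kernel vector belongs to a positive semidefinite matrix on the boundary of the cone of concentration matrices. The second belongs to an indefinite singular matrix, so its image lies on the Zariski closure of the dual curve rather than on the boundary of the convex cone itself.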



www.literka.addr.com/mathcountry/algebra/quartic.htm

This paper is organized as follows. In Section 2 we formally define the objects $\mathcal{K}_{\mathcal{L}}$, $\mathcal{C}_{\mathcal{L}}$, $P_{\mathcal{L}}$ and $H_{\mathcal{L}}$, which already appeared in Example 1.1, and we derive three guiding questions that constitute the main thread of this paper. These questions are answered for generic linear spaces $\mathcal{L}$ in Subsection 2.2. That subsection is written for algebraists, and readers from statistics or optimization can skip it at their first reading. In Section 3 we answer our three questions for diagonal concentration models, using results from geometric combinatorics. Section 4 deals with Gaussian graphical models, which are the most prominent linear concentration models. We resolve our three questions for chordal graphs, then for chordless cycles, and finally for wheels and all graphs with five or fewer vertices. We conclude this paper with a study of colored Gaussian graphical models in Section 5. These are special Gaussian graphical models with additional linear restrictions on the concentration matrix given by the graph coloring.



2 Linear Sections, Projections and Duality

Convex algebraic geometry is concerned with the geometry of real algebraic varieties and semi-algebraic sets that arise in convex optimization, especially in semidefinite programming. A fundamental problem is to study convex sets that arise as linear sections and projections of the cone of positive definite matrices $\mathbb{S}^m_{\succ 0}$. As seen in the Introduction, this problem arises naturally when studying maximum likelihood estimation in linear concentration models for Gaussian random variables. In particular, the issue of estimating a covariance matrix from the sufficient statistics can be seen as an extension of the familiar semidefinite matrix completion problem (Barrett et al. 1993; Grone et al. 1984). In what follows, we develop an algebraic and geometric framework for systematically addressing such problems.



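2.1 Derivation of three guiding questions

As before, we fix a linear subspace $\mathcal{L}$ in the real vector space $\mathbb{S}^m$ of symmetric $m \times m$-matrices, and we fix a basis $\{K_1, \ldots, K_d\}$ of $\mathcal{L}$. The cone of concentration matrices is the relatively open cone

$$\mathcal{K}_{\mathcal{L}} = \mathcal{L} \cap \mathbb{S}^m_{\succ 0}.$$

We assume throughout that $\mathcal{K}_{\mathcal{L}}$ is non-empty. Using the basis $K_1, \ldots, K_d$ of $\mathcal{L}$, we can identify $\mathcal{K}_{\mathcal{L}}$ with

$$\mathcal{K}_{\mathcal{L}} = \Bigl\{ (\lambda_1, \ldots, \lambda_d) \in \mathbb{R}^d : \sum_{i=1}^{d} \lambda_i K_i \text{ is positive definite} \Bigr\}. \tag{5}$$

This is a non-empty open convex cone in $\mathbb{R}^d$. The orthogonal complement $\mathcal{L}^{\perp}$ of $\mathcal{L}$ is a subspace of dimension $\binom{m+1}{2} - d$ in $\mathbb{S}^m$, so that $\mathbb{S}^m / \mathcal{L}^{\perp} \simeq \mathbb{R}^d$, and we can consider the canonical map

$$\pi_{\mathcal{L}} : \mathbb{S}^m \to \mathbb{S}^m / \mathcal{L}^{\perp}.$$

This is precisely the linear map which takes a sample covariance matrix $S$ to its canonical sufficient statistics. The chosen basis of $\mathcal{L}$ allows us to identify this map with

$$\pi_{\mathcal{L}} : \mathbb{S}^m \to \mathbb{R}^d, \quad S \mapsto \bigl( \langle S, K_1 \rangle, \ldots, \langle S, K_d \rangle \bigr). \tag{6}$$

We write $\mathcal{C}_{\mathcal{L}}$ for the image of the positive definite cone $\mathbb{S}^m_{\succ 0}$ under the map $\pi_{\mathcal{L}}$. We call $\mathcal{C}_{\mathcal{L}}$ the cone of sufficient statistics. The following result explains the duality between the two red curves in Fig. 1.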

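Proposition 2.1. The cone of sufficient statistics is the convex dual to the cone of concentration matrices. The basis-free version of this duality states

$$\mathcal{C}_{\mathcal{L}} = \bigl\{ S \in \mathbb{S}^m / \mathcal{L}^{\perp} : \langle S, K \rangle > 0 \text{ for all } K \in \mathcal{K}_{\mathcal{L}} \bigr\}. \tag{7}$$

The basis-dependent version of this duality, in terms of (5) and (6), states

$$\mathcal{C}_{\mathcal{L}} = \Bigl\{ (t_1, \ldots, t_d) \in \mathbb{R}^d : \sum_{i=1}^{d} t_i \lambda_i > 0 \text{ for all } (\lambda_1, \ldots, \lambda_d) \in \mathcal{K}_{\mathcal{L}} \Bigr\}. \tag{8}$$

Proof. Let $\mathcal{K}_{\mathcal{L}}^{\vee}$ denote the right-hand side of (7) and let $M = \binom{m+1}{2}$. Using the fact that the $M$-dimensional convex cone $\mathbb{S}^m_{\succ 0}$ is self-dual, general duality theory for convex cones implies

$$\mathcal{K}_{\mathcal{L}}^{\vee} = (\mathbb{S}^m_{\succ 0} \cap \mathcal{L})^{\vee} = (\mathbb{S}^m_{\succ 0} + \mathcal{L}^{\perp}) / \mathcal{L}^{\perp} = \mathcal{C}_{\mathcal{L}}.$$

To derive (8) from (7), we pick any basis $U_1, \ldots, U_M$ of $\mathbb{S}^m$ whose first $d$ elements serve as the dual basis to $K_1, \ldots, K_d$, and whose last $\binom{m+1}{2} - d$ elements span $\mathcal{L}^{\perp}$. Hence $\langle U_i, K_j \rangle = \delta_{ij}$ for all $i, j$. Every matrix $U$ in $\mathbb{S}^m$ has a unique representation $U = \sum_{i=1}^{M} t_i U_i$, and its image under the map (6) equals $\pi_{\mathcal{L}}(U) = (t_1, \ldots, t_d)$. For any matrix $K = \sum_{j=1}^{d} \lambda_j K_j$ in $\mathcal{L}$ we have $\langle U, K \rangle = \sum_{i=1}^{d} t_i \lambda_i$, and this expression is positive for all $K \in \mathcal{K}_{\mathcal{L}}$ if and only if $(t_1, \ldots, t_d)$ lies in $\mathcal{C}_{\mathcal{L}}$. □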

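It can be shown that both the cone $\mathcal{K}_{\mathcal{L}}$ of concentration matrices and its dual, the cone $\mathcal{C}_{\mathcal{L}}$ of sufficient statistics, are well-behaved in the following sense. Their topological closures are closed convex cones that are dual to each other, and they are obtained by respectively intersecting and projecting the closed cone $\mathbb{S}^m_{\succeq 0}$ of positive semidefinite matrices. In symbols, these closed semi-algebraic cones satisfy

$$\overline{\mathcal{K}_{\mathcal{L}}} = \mathcal{L} \cap \mathbb{S}^m_{\succeq 0} \quad \text{and} \quad \overline{\mathcal{C}_{\mathcal{L}}} = \pi_{\mathcal{L}}(\mathbb{S}^m_{\succeq 0}). \tag{9}$$

One of our objectives will be to explore the geometry of their boundaries

$$\partial \mathcal{K}_{\mathcal{L}} := \overline{\mathcal{K}_{\mathcal{L}}} \setminus \mathcal{K}_{\mathcal{L}} \quad \text{and} \quad \partial \mathcal{C}_{\mathcal{L}} := \overline{\mathcal{C}_{\mathcal{L}}} \setminus \mathcal{C}_{\mathcal{L}}.$$

These are convex algebraic hypersurfaces in $\mathbb{R}^d$, as seen in Example 1.1. The statistical theory of exponential families implies the following corollary concerning the geometry of their interiors:

Corollary 2.2. The map $K \mapsto T = \pi_{\mathcal{L}}(K^{-1})$ is a homeomorphism between the dual pair of open cones $\mathcal{K}_{\mathcal{L}}$ and $\mathcal{C}_{\mathcal{L}}$. The inverse map $T \mapsto K$ takes the sufficient statistics to the MLE of the concentration matrix. Here, $K^{-1}$ is the unique maximizer of the determinant over the spectrahedron $\pi_{\mathcal{L}}^{-1}(T) \cap \mathbb{S}^m_{\succ 0}$.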
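The inverse map in Corollary 2.2 can be approximated numerically by maximizing the concave function (2). The sketch below does this for the model of Example 1.1 with a damped Newton iteration in the coordinates $\lambda$. It is a minimal floating-point illustration under stated assumptions, not the algebraic method of the paper: the starting point $\lambda = (1,1,1)$, the tolerances, and all function names are choices made here, and the iteration is only meaningful when $T$ lies in the open cone of sufficient statistics.

```python
import math

# Basis K_1, K_2, K_3 of the subspace L from Example 1.1.
K_BASIS = [
    [[1, 0, 0], [0, 1, 1], [0, 1, 1]],
    [[1, 0, 1], [0, 1, 0], [1, 0, 1]],
    [[1, 1, 0], [1, 1, 0], [0, 0, 1]],
]

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def inv3(M):
    """Inverse of a 3x3 matrix via cofactors (cyclic index formula, signs included)."""
    d = det3(M)
    cof = [[M[(i + 1) % 3][(j + 1) % 3] * M[(i + 2) % 3][(j + 2) % 3]
            - M[(i + 1) % 3][(j + 2) % 3] * M[(i + 2) % 3][(j + 1) % 3]
            for j in range(3)] for i in range(3)]
    return [[cof[j][i] / d for j in range(3)] for i in range(3)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def trace(X):
    return X[0][0] + X[1][1] + X[2][2]

def concentration(lam):
    """K = lam_1 K_1 + lam_2 K_2 + lam_3 K_3."""
    return [[sum(lam[k] * K_BASIS[k][i][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def is_positive_definite(M):
    # Sylvester's criterion on the leading principal minors.
    return M[0][0] > 0 and M[0][0] * M[1][1] - M[0][1] * M[1][0] > 0 and det3(M) > 0

def sufficient_statistics(Sigma):
    return [trace(matmul(Sigma, Kj)) for Kj in K_BASIS]

def log_likelihood(lam, t):
    """The function (2), up to the constant n/2; -inf outside the cone."""
    K = concentration(lam)
    if not is_positive_definite(K):
        return -math.inf
    return math.log(det3(K)) - sum(l * ti for l, ti in zip(lam, t))

def mle_concentration(t, max_iter=100, tol=1e-12):
    """Damped Newton ascent for (2); returns the coefficients lambda of the MLE."""
    lam = [1.0, 1.0, 1.0]
    for _ in range(max_iter):
        Sigma = inv3(concentration(lam))
        grad = [s - ti for s, ti in zip(sufficient_statistics(Sigma), t)]
        if max(abs(g) for g in grad) < tol:
            break
        P = [matmul(Sigma, Kj) for Kj in K_BASIS]
        # Negative Hessian: N_ij = trace(K^{-1} K_i K^{-1} K_j), positive definite.
        N = [[trace(matmul(P[i], P[j])) for j in range(3)] for i in range(3)]
        Ninv = inv3(N)
        step = [sum(Ninv[i][j] * grad[j] for j in range(3)) for i in range(3)]
        s, f0 = 1.0, log_likelihood(lam, t)
        # Halve the step until the iterate stays in the cone and does not decrease (2).
        while s > 1e-16 and log_likelihood([l + s * d for l, d in zip(lam, step)], t) < f0:
            s *= 0.5
        lam = [l + s * d for l, d in zip(lam, step)]
    return lam

# Round trip: start from a known concentration matrix, project, and recover it.
lam_true = [1.0, 2.0, 3.0]
t_obs = sufficient_statistics(inv3(concentration(lam_true)))
lam_hat = mle_concentration(t_obs)
```

The round trip at the end mirrors the statement of the corollary: applying $K \mapsto \pi_{\mathcal{L}}(K^{-1})$ and then the numerical inverse should return the coefficients one started with, up to rounding.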

One natural first step in studying this picture is to simplify it by passing to the complex numbers $\mathbb{C}$. This allows us to relax various inequalities over the real numbers $\mathbb{R}$ and to work with varieties over the algebraically closed field $\mathbb{C}$. We thus identify our model $\mathcal{L}^{-1}_{\succ 0}$ with its Zariski closure $\mathcal{L}^{-1}$ in the $(\binom{m+1}{2} - 1)$-dimensional complex projective space $\mathbb{P}(\mathbb{S}^m)$. Let $P_{\mathcal{L}}$ denote the homogeneous prime ideal of all polynomials in $\mathbb{R}[\Sigma] = \mathbb{R}[s_{11}, s_{12}, \ldots, s_{mm}]$ that vanish on $\mathcal{L}^{-1}$. One way to compute $P_{\mathcal{L}}$ is to eliminate the entries of an indeterminate symmetric $m \times m$-matrix $K$ from the following system of equations:

$$\Sigma \cdot K = \mathrm{Id}_m, \quad K \in \mathcal{L}. \tag{10}$$

Given a sample covariance matrix $S$, its maximum likelihood estimate $\hat{\Sigma}$ can be computed algebraically, as in Example 1.1. We do this by solving the following zero-dimensional system of polynomial equations:

$$\Sigma \cdot K = \mathrm{Id}_m, \quad K \in \mathcal{L}, \quad \Sigma - S \in \mathcal{L}^{\perp}. \tag{11}$$

In the present paper we focus on the systems (10) and (11). Specifically, for various classes of linear concentration models $\mathcal{L}$, we seek to answer the following three guiding questions. Example 1.1 served to introduce these three questions. Many more examples will be featured throughout our discussion.

Question 1. What can be said about the geometry of the $(d-1)$-dimensional projective variety $\mathcal{L}^{-1}$? What is the degree of this variety, and what are the minimal generators of its prime ideal $P_{\mathcal{L}}$?

Question 2. The map taking a sample covariance matrix $S$ to its maximum likelihood estimate $\hat{\Sigma}$ is an algebraic function. Its degree is the ML degree of the model $\mathcal{L}$. See (Drton et al. 2009, Def. 2.1.4). Can we find a formula for this ML degree? Which models $\mathcal{L}$ have their ML degree equal to 1?

Question 3. The Zariski closure of the boundary $\partial \mathcal{C}_{\mathcal{L}}$ of the cone of sufficient statistics $\mathcal{C}_{\mathcal{L}}$ is a hypersurface in the complex projective space $\mathbb{P}^{d-1}$. What is the defining polynomial $H_{\mathcal{L}}$ of this hypersurface?

2.2 Generic linear concentration models

In this subsection we examine the case when $\mathcal{L}$ is a generic subspace of dimension $d$ in $\mathbb{S}^m$. Here "generic" is understood in the sense of algebraic geometry. In terms of the model representation (1), this means that the matrices $K_1, \ldots, K_d$ were chosen at random. This is precisely the hypothesis made by Nie et al. (2009), and one of our goals is to explain the connection of Questions 1-3 to that paper.

To begin with, we establish the result that the two notions of degree coincide in the generic case.

Theorem 2.3. The ML degree of the model (1) defined by a generic linear subspace $\mathcal{L}$ of dimension $d$ in $\mathbb{S}^m$ equals the degree of the projective variety $\mathcal{L}^{-1}$. That degree is denoted $\phi(m, d)$ and it satisfies

$$\phi(m, d) = \phi\Bigl( m, \binom{m+1}{2} + 1 - d \Bigr). \tag{12}$$



We calculated the ML degree $\phi(m, d)$ of the generic model $\mathcal{L}$ for all matrix sizes up to $m = 6$:

d           1   2   3   4    5    6     7     8     9    10    11    12    13
phi(3,d)    1   2   4   4    2    1
phi(4,d)    1   3   9  17   21   21    17     9     3     1
phi(5,d)    1   4  16  44   86  137   188   212   188   137    86    44    16  ...
phi(6,d)    1   5  25  90  240  528  1016  1696  2396  2886  3054  2886  2396  ...

This table was computed with the software Macaulay2, using the commutative algebra techniques discussed in the proof of Theorem 2.3. At this point, readers from statistics are advised to skip the algebraic technicalities in the rest of this section and to go straight to Section 4 on graphical models.

The last three entries in each row follow from Bezout's Theorem because $P_{\mathcal{L}}$ is a complete intersection when the codimension of $\mathcal{L}^{-1}$ in $\mathbb{P}(\mathbb{S}^m)$ is at most two. Using the duality relation (12), we conclude

$$\phi(m, d) = (m-1)^{d-1} \quad \text{for } d = 1, 2, 3.$$

When $\mathcal{L}^{-1}$ has codimension 3, it is the complete intersection defined by three generic linear combinations of the comaximal minors. From this complete intersection we must remove the variety of $m \times m$-symmetric matrices of rank $\le m-2$, which has also codimension 3 and has degree $\binom{m+1}{3}$. Hence:

$$\phi(m, 4) = (m-1)^3 - \binom{m+1}{3} = \frac{1}{6} (5m-3)(m-1)(m-2).$$

When $d$ is larger than 4, this approach leads to a problem in residual intersection theory. A formula due to Stückrad (1992), rederived in recent work by Chardin et al. (200^) on this subject, implies that

$$\phi(m, 5) = \frac{1}{12} (m-1)(m-2)(7m^2 - 19m + 6).$$

For any fixed dimension $d$, our ML degree $\phi(m, d)$ seems to be a polynomial function of degree $d-1$ in $m$, but it gets progressively more challenging to compute explicit formulas for these polynomials.
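The closed formulas for $d \le 5$ and the symmetry (12) can be compared against the tabulated ML degrees. The following sketch does this with integer arithmetic; the dictionary `TABLE` simply transcribes the table above (truncated at $d = 13$, as printed), and the function names are hypothetical.

```python
from math import comb

# ML degrees phi(m, d) of generic models, transcribed from the table (d = 1, 2, ...).
TABLE = {
    3: [1, 2, 4, 4, 2, 1],
    4: [1, 3, 9, 17, 21, 21, 17, 9, 3, 1],
    5: [1, 4, 16, 44, 86, 137, 188, 212, 188, 137, 86, 44, 16],
    6: [1, 5, 25, 90, 240, 528, 1016, 1696, 2396, 2886, 3054, 2886, 2396],
}

def phi_closed_form(m, d):
    """Closed formulas quoted in the text; only available for d <= 5."""
    if d in (1, 2, 3):
        return (m - 1) ** (d - 1)
    if d == 4:
        return (m - 1) ** 3 - comb(m + 1, 3)
    if d == 5:
        return (m - 1) * (m - 2) * (7 * m * m - 19 * m + 6) // 12
    raise ValueError("no closed formula is quoted for d > 5")

def symmetric_within_table(m):
    """Check phi(m, d) == phi(m, M + 1 - d) wherever both entries are tabulated."""
    M = comb(m + 1, 2)
    row = TABLE[m]
    return all(row[d - 1] == row[M - d]
               for d in range(1, len(row) + 1)
               if 1 <= M + 1 - d <= len(row))

formulas_match = all(phi_closed_form(m, d) == TABLE[m][d - 1]
                     for m in TABLE for d in range(1, 6))
```

For $m = 5$ and $m = 6$ the printed rows are truncated, so the symmetry check only compares those pairs $d$ and $\binom{m+1}{2} + 1 - d$ that both appear in the table.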



Proof of Theorem 2.3. Let $I$ be the ideal in the polynomial ring $\mathbb{R}[\Sigma] = \mathbb{R}[s_{11}, s_{12}, \ldots, s_{mm}]$ that is generated by the $(m-1) \times (m-1)$-minors of the symmetric $m \times m$-matrix $\Sigma = (s_{ij})$. Kotzev (1991) proved that the Rees algebra $\mathcal{R}(I)$ of the ideal $I$ is equal to the symmetric algebra of $I$. Identifying the generators of $I$ with the entries of another symmetric matrix of unknowns $K = (k_{ij})$, we represent this Rees algebra as $\mathcal{R}(I) = \mathbb{R}[\Sigma, K]/J$ where the ideal $J$ is obtained by eliminating the unknown $t$ from the matrix equation $\Sigma \cdot K = t \cdot \mathrm{Id}_m$. The presentation ideal $J = \langle \Sigma \cdot K - t\, \mathrm{Id}_m \rangle \cap \mathbb{R}[\Sigma, K]$ is prime and it is homogeneous with respect to the natural $\mathbb{N}^2$-grading on the polynomial ring $\mathbb{R}[\Sigma, K]$. Its variety $V(J)$ in $\mathbb{P}^{M-1} \times \mathbb{P}^{M-1}$ is the closure of the set of pairs of symmetric matrices that are inverse to each other. Here $M = \binom{m+1}{2}$. Both the dimension and the codimension of $V(J)$ are equal to $M - 1$.



2 www.math.uiuc.edu/Macaulay2/

We now make use of the notion of multidegree introduced in the text book of Miller and Sturmfels (2004). Namely, we consider the bidegree of the Rees algebra $\mathcal{R}(I) = \mathbb{R}[\Sigma, K]/J$ with respect to its $\mathbb{N}^2$-grading. This bidegree is a homogeneous polynomial in two variables $x$ and $y$ of degree $M - 1$. Using notation as in (Miller and Sturmfels 2004, Def. 8.45) and (Nie et al. 2009, Thm. 10), we claim that

$$\mathcal{C}(\mathcal{R}(I); x, y) = \sum_{d=1}^{M} \phi(m, d)\, x^{M-d} y^{d-1}. \tag{13}$$

Indeed, the coefficient of $x^{M-d} y^{d-1}$ in the expansion of $\mathcal{C}(\mathcal{R}(I); x, y)$ equals the cardinality of the finite variety $V(J) \cap (\mathcal{M} \times \mathcal{L})$, where $\mathcal{L}$ is a generic plane of dimension $d-1$ in the second factor $\mathbb{P}^{M-1}$ and $\mathcal{M}$ is a generic plane of dimension $M - d$ in the first factor $\mathbb{P}^{M-1}$. We now take $\mathcal{M}$ to be the specific plane which is spanned by the image of $\mathcal{L}^{\perp}$ and one extra generic point $S$, representing a random sample covariance matrix. Thus our finite variety is precisely the same as the one described by the affine equations in (11), and we conclude that its cardinality equals the ML degree $\phi(m, d)$.

Note that $V(J) \cap (\mathbb{P}^{M-1} \times \mathcal{L})$ can be identified with the variety $V(P_{\mathcal{L}})$ in $\mathbb{P}^{M-1}$. The argument in the previous paragraph relied on the fact that $P_{\mathcal{L}}$ is Cohen-Macaulay, which allowed us to choose any subspace $\mathcal{M}$ for our intersection count provided it is disjoint from $V(P_{\mathcal{L}})$ in $\mathbb{P}^{M-1}$. This proves that $\phi(m, d)$ coincides with the degree of $V(P_{\mathcal{L}})$. The Cohen-Macaulay property of $P_{\mathcal{L}}$ follows from a result of Herzog et al. (1985) together with the aforementioned work of Kotzev (1991) which shows that the ideal $I$ has sliding depth. Finally, the duality (12) is obvious for the coefficients of the bidegree (13) of the Rees algebra $\mathcal{R}(I)$ since its presentation ideal $J$ is symmetric under swapping $K$ and $\Sigma$. □

We now come to our third question, which is to determine the Zariski closure $V(H_{\mathcal{L}})$ of the boundary of the cone $\mathcal{C}_{\mathcal{L}} = \mathcal{K}_{\mathcal{L}}^{\vee}$. Let us assume now that $\mathcal{L}$ is any $d$-dimensional linear subspace of $\mathbb{S}^m$, not necessarily generic. The Zariski closure of $\partial \mathcal{K}_{\mathcal{L}}$ is the hypersurface $\{\det(K) = 0\}$ given by the vanishing of the determinant of $K = \sum_{i=1}^{d} \lambda_i K_i$. This determinant is a polynomial of degree $m$ in $\lambda_1, \ldots, \lambda_d$. Our task is to compute the dual variety in the sense of projective algebraic geometry of each irreducible component of this hypersurface. See (Nie et al. 2009, §5) for basics on projective duality. We also need to compute the dual variety for its singular locus, and for the singular locus of the singular locus, etc.

Each singularity stratum encountered along the way needs to be decomposed into irreducible components, whose duals need to be examined. If such a component has a real point that lies in $\partial \mathcal{K}_{\mathcal{L}}$ and if its dual variety is a hypersurface then that hypersurface appears in $H_{\mathcal{L}}$. How to run this procedure in practice is shown in Example 4.10. For now, we summarize the construction informally as follows.

Proposition 2.4. Each irreducible hypersurface in the Zariski closure of $\partial \mathcal{C}_{\mathcal{L}}$ is the projectively dual variety to some irreducible component of the hypersurface $\{\det(K) = 0\}$, or it is dual to some irreducible variety further down in the singularity stratification of the hypersurface $\{\det(K) = 0\} \subset \mathbb{P}^{d-1}$.

The singular stratification of $\{\det(K) = 0\}$ can be computed by applying primary decomposition to the ideal of $p \times p$-minors of $K$ for $1 \le p \le m$. If $I$ is any minimal prime of such a determinantal ideal then its dual variety is computed as follows. Let $c = \operatorname{codim}(I)$ and consider the Jacobian matrix of $I$. The rows of the Jacobian matrix are the derivatives of the generators of $I$ with respect to the unknowns $\lambda_1, \ldots, \lambda_d$. Let $J$ be the ideal generated by $I$ and the $(c+1) \times (c+1)$-minors of the matrix formed by augmenting the Jacobian matrix by the extra row $(t_1, t_2, \ldots, t_d)$. We saturate $J$ by the $c \times c$-minors of the Jacobian, and thereafter we compute the elimination ideal $J \cap \mathbb{R}[t_1, t_2, \ldots, t_d]$. If this elimination ideal is principal, we retain its generator. The desired polynomial $H_{\mathcal{L}}$ is the product of these principal generators, as $I$ runs over all such minimal primes whose variety has a real point on the convex hypersurface $\partial \mathcal{K}_{\mathcal{L}}$.

Proposition 2.4 is visualized also in Fig. 4 below. Let us now apply this result in the case when the subspace $\mathcal{L}$ is generic of dimension $d$. The ideal of $p \times p$-minors of $K$ defines a subvariety of $\mathbb{P}^{d-1}$, which is irreducible whenever it is positive-dimensional (by Bertini's Theorem). It is known from (Nie et al. 2009, Prop. 5) that the dual variety to that determinantal variety is a hypersurface if and only if

$$\binom{m-p+2}{2} \le d - 1 \quad \text{and} \quad \binom{p}{2} \le \binom{m+1}{2} - d + 1. \tag{14}$$

Assuming that these inequalities hold, the dual hypersurface is defined by an irreducible homogeneous polynomial whose degree we denote by $\delta(d-1, m, p-1)$. This notation is consistent with Nie et al. (2009) where this number is called the algebraic degree of semidefinite programming (SDP).

Corollary 2.5. For a generic $d$-dimensional subspace $\mathcal{L}$ of $\mathbb{S}^m$, the polynomial $H_{\mathcal{L}}$ is the product of irreducible polynomials of degree $\delta(d-1, m, p-1)$. That number is the algebraic degree of semidefinite programming. Here $p$ runs over integers that satisfy (14) and $\partial \mathcal{K}_{\mathcal{L}}$ contains a matrix of rank $p - 1$.



3 Diagonal Matrices, Matroids and Polytopes

This section concerns the case when $\mathcal{L}$ is a $d$-dimensional space consisting only of diagonal matrices in $\mathbb{S}^m$. Here, the set $\mathcal{L}^{-1}_{\succ 0}$ of covariance matrices in the model also consists of diagonal matrices only, and we may restrict our considerations to the space $\mathbb{R}^m$ of diagonal matrices in $\mathbb{S}^m$. Thus, throughout this section, our ambient space is $\mathbb{R}^m$, and we identify $\mathbb{R}^m$ with its dual vector space via the standard inner product $\langle u, v \rangle = \sum_{i=1}^{m} u_i v_i$. We fix any $d \times m$-matrix $A$ whose row space equals $\mathcal{L}$, and we assume that $\mathcal{L} \cap \mathbb{R}^m_{>0} \neq \emptyset$. We consider the induced projection of the open positive orthant

$$\pi : \mathbb{R}^m_{>0} \to \mathbb{R}^d, \quad x \mapsto Ax. \tag{15}$$

Since $\mathcal{L} = \operatorname{rowspace}(A)$ contains a strictly positive vector, the image of $\pi$ is a pointed polyhedral cone, namely $\mathcal{C}_{\mathcal{L}} = \operatorname{pos}(A)$ is the cone spanned by the columns of $A$. Each fiber of $\pi$ is a bounded convex polytope, and maximum likelihood estimation amounts to finding a distinguished point $\hat{x}$ in that fiber.

The problem of characterizing the existence of the MLE in this situation amounts to a standard problem of geometric combinatorics (see e.g. Ziegler (1995)), namely, to computing the facet description of the convex polyhedral cone spanned by the columns of $A$. For a given vector $t \in \mathbb{R}^d$ of sufficient statistics, the maximum likelihood estimate exists in this diagonal concentration model if and only if $t$ lies in the interior of the cone $\operatorname{pos}(A)$. This happens if and only if all facet inequalities are strict for $t$.

This situation is reminiscent of Birch's Theorem for toric models in algebraic statistics (Pachter and Sturmfels 2005, Theorem 1.10), and, indeed, the combinatorial set-up for deciding the existence of the MLE is identical to that for toric models. For a statistical perspective see Eriksson et al. (2006). However, the algebraic structure here is not that of toric models, described in (Pachter and Sturmfels 2005, §1.2.2), but that of the linear models in (Pachter and Sturmfels 2005, §1.2.1).

Our model here is not toric but it is the coordinatewise reciprocal of an open polyhedral cone:

$$\mathcal{L}^{-1}_{>0} = \bigl\{ u \in \mathbb{R}^m_{>0} : u^{-1} = (u_1^{-1}, u_2^{-1}, \ldots, u_m^{-1}) \in \mathcal{L} \bigr\}.$$

As in Section 2, we view its Zariski closure $\mathcal{L}^{-1}$ as a subvariety in complex projective space:

$$\mathcal{L}^{-1} = \bigl\{ u \in \mathbb{P}^{m-1} : u^{-1} = (u_1^{-1}, u_2^{-1}, \ldots, u_m^{-1}) \in \mathcal{L} \bigr\}.$$

Maximum likelihood estimation means intersecting the variety $\mathcal{L}^{-1}$ with the fibers of $\pi$.
Example 3.1. Let $m = 4$, $d = 2$ and take $\mathcal{L}$ to be the row space of the matrix

$$A = \begin{pmatrix} 3 & 2 & 1 & 0 \\ 0 & 1 & 2 & 3 \end{pmatrix}.$$

The corresponding statistical model consists of all multivariate normal distributions on $\mathbb{R}^4$ whose concentration matrix has the diagonal form

$$K = \begin{pmatrix} 3\lambda_1 & 0 & 0 & 0 \\ 0 & 2\lambda_1 + \lambda_2 & 0 & 0 \\ 0 & 0 & \lambda_1 + 2\lambda_2 & 0 \\ 0 & 0 & 0 & 3\lambda_2 \end{pmatrix}.$$

Our variety $\mathcal{L}^{-1}$ is the curve in $\mathbb{P}^3$ parametrized by the inverse diagonal matrices which we write as $K^{-1} = \operatorname{diag}(x_1, x_2, x_3, x_4)$. The prime ideal $P_{\mathcal{L}}$ of this curve is generated by three quadratic equations:

$$x_2 x_3 - 2 x_2 x_4 + x_3 x_4 = 2 x_1 x_3 - 3 x_1 x_4 + x_3 x_4 = x_1 x_2 - 3 x_1 x_4 + 2 x_2 x_4 = 0.$$

Consider any sample covariance matrix $S = (s_{ij})$, with sufficient statistics

$$t_1 = 3 s_{11} + 2 s_{22} + s_{33} > 0 \quad \text{and} \quad t_2 = s_{22} + 2 s_{33} + 3 s_{44} > 0.$$

The MLE for these sufficient statistics is the unique positive solution $\hat{x}$ of the three quadratic equations above, together with the two linear equations

$$3 x_1 + 2 x_2 + x_3 = t_1 \quad \text{and} \quad x_2 + 2 x_3 + 3 x_4 = t_2.$$

We find that $\hat{x}$ is an algebraic function of degree 3 in the sufficient statistics $(t_1, t_2)$, so the ML degree of the model $\mathcal{L}$ equals 3. This is consistent with formula (16) below, since $\binom{m-1}{d-1} = \binom{4-1}{2-1} = 3$. □
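For a numerical view of Example 3.1, the MLE $\hat{x}$ can be approximated by maximizing $\sum_i \log u_i - \lambda \cdot t$ over $\lambda$, where $u = \lambda A$ is the diagonal of $K$, and then setting $\hat{x} = u^{-1}$. The sketch below uses a damped Newton iteration in floating point. It is an illustration under stated assumptions rather than the algebraic solution of the five equations: the starting point $\lambda = (1, 1)$, the tolerances, and the function names are choices made here.

```python
import math

# Matrix A of Example 3.1; its row space is the diagonal model L.
A = [[3.0, 2.0, 1.0, 0.0],
     [0.0, 1.0, 2.0, 3.0]]

def diagonal_of_K(lam):
    """Diagonal entries u = lam * A of the concentration matrix K."""
    return [A[0][i] * lam[0] + A[1][i] * lam[1] for i in range(4)]

def objective(lam, t):
    """log det(K) - <S, K> in the diagonal case; -inf outside the open cone."""
    u = diagonal_of_K(lam)
    if min(u) <= 0.0:
        return -math.inf
    return sum(math.log(ui) for ui in u) - lam[0] * t[0] - lam[1] * t[1]

def analytic_center(t, max_iter=100, tol=1e-12):
    """Return (lam, x_hat) with x_hat = 1/(lam * A) and A x_hat approximately t."""
    lam = [1.0, 1.0]
    for _ in range(max_iter):
        u = diagonal_of_K(lam)
        grad = [sum(A[j][i] / u[i] for i in range(4)) - t[j] for j in range(2)]
        if max(abs(g) for g in grad) < tol:
            break
        # Negative Hessian, a positive definite 2x2 matrix.
        n00 = sum(A[0][i] * A[0][i] / u[i] ** 2 for i in range(4))
        n01 = sum(A[0][i] * A[1][i] / u[i] ** 2 for i in range(4))
        n11 = sum(A[1][i] * A[1][i] / u[i] ** 2 for i in range(4))
        det = n00 * n11 - n01 * n01
        step = [(n11 * grad[0] - n01 * grad[1]) / det,
                (n00 * grad[1] - n01 * grad[0]) / det]
        s, f0 = 1.0, objective(lam, t)
        # Halve the step until the iterate stays feasible and does not decrease the objective.
        while s > 1e-16 and objective([lam[0] + s * step[0], lam[1] + s * step[1]], t) < f0:
            s *= 0.5
        lam = [lam[0] + s * step[0], lam[1] + s * step[1]]
    x_hat = [1.0 / ui for ui in diagonal_of_K(lam)]
    return lam, x_hat

# Hypothetical sufficient statistics: the image of x = (1/3, 1/4, 1/5, 1/6) under A.
t_example = [1.7, 1.15]
lam_hat, x_hat = analytic_center(t_example)
```

The statistics `t_example` were chosen as the image of the reciprocal of $u = (3, 4, 5, 6)$, that is $\lambda = (1, 2)$, so the iteration should return values close to that point, and the three quadrics of the example should nearly vanish at `x_hat`.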



We now present the solutions to our three guiding problems for arbitrary $d$-dimensional subspaces $\mathcal{L}$ of the space $\mathbb{R}^m$ of $m \times m$-diagonal matrices. The degree of the projective variety $\mathcal{L}^{-1}$ and its prime ideal $P_{\mathcal{L}}$ are known from work of Terao (2002) and its refinements due to Proudfoot and Speyer (2006). Namely, the degree of $\mathcal{L}^{-1}$ equals the beta-invariant of the rank $m - d$ matroid on $[m] = \{1, 2, \ldots, m\}$ associated with $\mathcal{L}$. We denote this beta-invariant by $\beta(\mathcal{L})$. For matroid basics see White (1992).

The beta-invariant $\beta(\mathcal{L})$ is known to equal the number of bounded regions in the $(m - d)$-dimensional hyperplane arrangement (cf. Zaslavsky (1975)) obtained by intersecting the affine space $u + \mathcal{L}^{\perp}$ with the $m$ coordinate hyperplanes $\{x_i = 0\}$. Here $u$ can be any generic vector in $\mathbb{R}^m_{>0}$. One of these regions, namely the one containing $u$, is precisely the fiber of $\pi$. If $\mathcal{L}$ is a generic $d$-dimensional linear subspace of $\mathbb{R}^m$, meaning that the above matroid is the uniform matroid, then the beta-invariant equals

$$\beta(\mathcal{L}) = \binom{m-1}{d-1}. \tag{16}$$

For non-generic subspaces $\mathcal{L}$, this binomial coefficient is always an upper bound for $\beta(\mathcal{L})$.

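Theorem 3.2. (Proudfoot and Speyer (2006); Terao (2002)) The degree of the projective variety $\mathcal{L}^{-1}$ equals the beta-invariant $\beta(\mathcal{L})$. Its prime ideal $P_{\mathcal{L}}$ is generated by the homogeneous polynomials

$$\sum_{i \in \operatorname{supp}(v)} v_i \cdot \prod_{j \in \operatorname{supp}(v) \setminus \{i\}} x_j \tag{17}$$

where $v$ runs over all non-zero vectors of minimal support in $\mathcal{L}^{\perp}$.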
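To make the generators (17) concrete, the sketch below lists the minimal-support vectors of $\mathcal{L}^{\perp} = \ker(A)$ for the matrix $A$ of Example 3.1 and evaluates the corresponding polynomials at points of the reciprocal variety. It is a hedged illustration that only covers this $2 \times 4$ case: because every $2 \times 2$ minor of this $A$ is non-zero, each minimal support consists of three columns and the kernel vector is given by Cramer's rule. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

# Matrix A of Example 3.1 (d = 2 rows, m = 4 columns).
A = [[3, 2, 1, 0],
     [0, 1, 2, 3]]

def minor(i, j):
    """2x2 minor of A on columns i and j."""
    return A[0][i] * A[1][j] - A[0][j] * A[1][i]

def circuits():
    """Kernel vectors of A supported on exactly three columns."""
    result = []
    for i, j, k in combinations(range(4), 3):
        v = [0, 0, 0, 0]
        v[i], v[j], v[k] = minor(j, k), -minor(i, k), minor(i, j)
        result.append(v)
    return result

def circuit_polynomial(v, x):
    """Evaluate formula (17) for the vector v at the point x."""
    support = [i for i in range(4) if v[i] != 0]
    total = 0
    for i in support:
        term = v[i]
        for j in support:
            if j != i:
                term *= x[j]
        total += term
    return total

def reciprocal_point(lam1, lam2):
    """Point x = 1/(lam * A) of the reciprocal variety (needs all entries non-zero)."""
    return [Fraction(1, A[0][i] * lam1 + A[1][i] * lam2) for i in range(4)]

x = reciprocal_point(1, 2)          # equals (1/3, 1/4, 1/5, 1/6)
values = [circuit_polynomial(v, x) for v in circuits()]
```

The four vectors found this way are integer multiples of $(1,-2,1,0)$, $(2,-3,0,1)$, $(1,0,-3,2)$ and $(0,1,-2,1)$. Three of the resulting quadrics are, up to scaling, the generators displayed in Example 3.1.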

For experts in combinatorial commutative algebra, we note that Proudfoot and Speyer (2006) actually prove the following stronger results. The homogeneous polynomials (17) form a universal Gröbner basis of $P_{\mathcal{L}}$. The initial monomial ideal of $P_{\mathcal{L}}$ with respect to any term order is the Stanley-Reisner ideal of the corresponding broken circuit complex of the matroid of $\mathcal{L}$. Hence the Hilbert series of $P_{\mathcal{L}}$ is the rational function obtained by dividing the $h$-polynomial $h(t)$ of the broken circuit complex by $(1-t)^d$. In particular, the degree of $P_{\mathcal{L}}$ is the number $h(1) = \beta(\mathcal{L})$ of broken circuit bases (White 1992, §7).

We next consider Question 2 in the diagonal case. The maximum likelihood map takes each vector $t$ in the cone of sufficient statistics $\mathcal{C}_{\mathcal{L}} = \operatorname{pos}(A)$ to a point of its fiber, namely:

$$\hat{x} = \operatorname{argmax} \Bigl\{ \sum_{i=1}^{m} \log(x_i) : x \in \mathbb{R}^m_{>0} \text{ and } Ax = t \Bigr\}. \tag{18}$$

This is the unique point in the polytope $\pi^{-1}(t) = \{ x \in \mathbb{R}^m_{>0} : Ax = t \}$ which maximizes the product $x_1 x_2 \cdots x_m$ of the coordinates. It is also the unique point in $\pi^{-1}(t)$ that lies in the reciprocal linear variety $\mathcal{L}^{-1}$. In the linear programming literature, the point $\hat{x}$ is known as the analytic center of the polytope $\pi^{-1}(t)$. In Section 2 we discussed the extension of this concept from linear programming to semidefinite programming: the analytic center of a spectrahedron is the unique point $\hat{\Sigma}$ at which the determinant function attains its maximum. For an applied perspective see Vandenberghe et al. (1996).

For any linear subspace $\mathcal{L}$ in $\mathbb{S}^m$, the algebraic degree of the maximum likelihood map $t \mapsto \hat{\Sigma}$ is always less than or equal to the degree of the projective variety $\mathcal{L}^{-1}$. We saw in Theorem 2.3 that these degrees are equal for generic $\mathcal{L}$. We next show that the same conclusion holds for diagonal subspaces $\mathcal{L}$.

Corollary 3.3. The ML degree of any diagonal linear concentration model $\mathcal{L} \subset \mathbb{R}^m \subset \mathbb{S}^m$ is equal to the beta-invariant $\beta(\mathcal{L})$ of the corresponding matroid of rank $m - d$ on $\{1, 2, \ldots, m\}$.

Proof. The beta-invariant $\beta(\mathcal{L})$ counts the bounded regions in the arrangement of hyperplanes arising from the given facet description of the polytope $\pi^{-1}(t)$. Varchenko's Formula for linear models, derived in (Pachter and Sturmfels 2005, Theorem 1.5), states that the optimization problem (18) has precisely one real critical point in each bounded region, and that there are no other complex critical points. □

A fundamental question regarding the ML degree of any class of algebraic statistical models is to characterize those models which have ML degree one. These are the models whose maximum likelihood estimator is a rational function in the sufficient statistics (Drton et al. 2009, §2.1). In the context here, we have the following characterization of matroids whose beta-invariant $\beta(\mathcal{L})$ equals one.

Corollary 3.4. The ML degree $\beta(\mathcal{L})$ of a diagonal linear concentration model $\mathcal{L}$ is equal to one if and only if the matroid of $\mathcal{L}$ is the graphic matroid of a series-parallel graph.

Proof. The equivalence of series-parallel and $\beta = 1$ first appeared in (Brylawski 1971, Theorem 7.6). □



We now come to Question 3 which concerns the duality of convex cones in Proposition 2.1. In the diagonal case, the geometric view on this duality is as follows. The cone of sufficient statistics equals $\mathcal{C}_{\mathcal{L}} = \operatorname{pos}(A)$ and its convex dual is the cone $\mathcal{K}_{\mathcal{L}} = \operatorname{rowspace}(A) \cap \mathbb{R}^m_{>0}$. Both cones are convex, polyhedral, pointed, and have dimension $d$. By passing to their cross sections with suitable affine hyperplanes, we can regard the two cones $\mathcal{C}_{\mathcal{L}}$ and $\mathcal{K}_{\mathcal{L}}$ as a dual pair of $(d-1)$-dimensional convex polytopes.

The hypersurface $\{\det(K) = 0\}$ is a union of $m$ hyperplanes. The strata in its singularity stratification, discussed towards the end of Section 2, correspond to the various faces $F$ of the polytope $\mathcal{K}_{\mathcal{L}}$. The dual variety to a face $F$ is the complementary face of the dual polytope $\mathcal{C}_{\mathcal{L}}$, and hence the codimension of that dual variety equals one if and only if $F$ is a vertex (= 0-dimensional face) of $\mathcal{K}_{\mathcal{L}}$. This confirms that the polynomial $H_{\mathcal{L}}$ sought in Question 3 is the product of all facet-defining linear forms of $\mathcal{C}_{\mathcal{L}}$.

Corollary 2.2 furnishes a homeomorphism $u \mapsto A u^{-1}$ from the interior of the polytope $\mathcal{K}_{\mathcal{L}}$ onto the interior of its dual polytope $\mathcal{C}_{\mathcal{L}}$. The inverse to the rational function $u \mapsto A u^{-1}$ is an algebraic function whose degree is the beta-invariant $\beta(\mathcal{L})$. This homeomorphism is the natural generalization, from simplices to arbitrary polytopes, of the classical Cremona transformation of projective geometry. We close this section with a nice 3-dimensional example which illustrates this homeomorphism.

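Example 3.5 (How to morph a cube into an octahedron). Fix $m = 6$, $d = 4$, and $\mathcal{L}$ the row space of

$$A = \begin{pmatrix} 1 & -1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & -1 \\ 1 & 1 & 1 & 1 & 1 & 1 \end{pmatrix}.$$

We identify the cone $\mathcal{K}_{\mathcal{L}} = \operatorname{rowspace}(A) \cap \mathbb{R}^6_{>0}$ with $\{\lambda \in \mathbb{R}^4 : \lambda \cdot A > 0\}$. This is the cone over the 3-cube, which is obtained by setting $\lambda_4 = 1$. The dual cone $\mathcal{C}_{\mathcal{L}} = \operatorname{pos}(A)$ is spanned by the six columns of the matrix $A$. It is the cone over the octahedron, which is obtained by setting $t_4 = 1$.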

We write the homeomorphism $u \mapsto Au^{-1}$ between these two four-dimensional cones in terms of the coordinates of $\lambda$ and $t$. Explicitly, the equation $t = A \cdot (\lambda A)^{-1}$ translates into the scalar equations

$$t_i = \frac{1}{\lambda_4 + \lambda_i} - \frac{1}{\lambda_4 - \lambda_i} \quad (i = 1,2,3), \qquad t_4 = \sum_{i=1}^{3} \left( \frac{1}{\lambda_4 + \lambda_i} + \frac{1}{\lambda_4 - \lambda_i} \right).$$

Substituting $\lambda_4 = 1$, we get the bijection $(\lambda_1,\lambda_2,\lambda_3) \mapsto (t_1/t_4, t_2/t_4, t_3/t_4)$ between the open cube $(-1,+1)^3$ and the open octahedron $\{t \in \mathbb{R}^3 : |t_1| + |t_2| + |t_3| < 1\}$. The inverse map $t \mapsto \lambda$ is an algebraic function of degree $\beta(\mathcal{L}) = 7$. That the ML degree of this model is 7 can be seen as follows. The fibers $\pi^{-1}(t)$ are the convex polygons which can be obtained from a regular hexagon by parallel displacement of its six edges. The corresponding arrangement of six lines has 7 bounded regions. □
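The map of Example 3.5 is easy to probe numerically. The following Python sketch is only an illustration under the reading $t_i = 1/(\lambda_4+\lambda_i) - 1/(\lambda_4-\lambda_i)$ and $t_4 = \sum_i \bigl(1/(\lambda_4+\lambda_i) + 1/(\lambda_4-\lambda_i)\bigr)$; the function name `cube_to_octahedron` and the sampling range are our own choices. It samples points of the open cube and records the largest value of $|t_1/t_4| + |t_2/t_4| + |t_3/t_4|$ seen.

```python
import random

def cube_to_octahedron(lam1, lam2, lam3, lam4=1.0):
    """Return (t1/t4, t2/t4, t3/t4) for t = A * (lam A)^{-1} as in Example 3.5."""
    lam = (lam1, lam2, lam3)
    # u = lam * A has the six coordinates lam4 + lam_i and lam4 - lam_i;
    # u^{-1} is the coordinatewise inverse.
    t = [1.0 / (lam4 + l) - 1.0 / (lam4 - l) for l in lam]
    t4 = sum(1.0 / (lam4 + l) + 1.0 / (lam4 - l) for l in lam)
    return tuple(ti / t4 for ti in t)

random.seed(0)
samples = [tuple(random.uniform(-0.99, 0.99) for _ in range(3)) for _ in range(1000)]
images = [cube_to_octahedron(*p) for p in samples]

# Every image point should lie in the open octahedron |t1| + |t2| + |t3| < 1.
max_l1_norm = max(sum(abs(c) for c in q) for q in images)
print(max_l1_norm)
```

The center of the cube is sent to the center of the octahedron, and points near a facet of the cube are sent near the boundary of the octahedron.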
4 Gaussian Graphical Models



An undirected Gaussian graphical model arises when the subspace $\mathcal{L}$ of $\mathbb{S}^m$ is defined by the vanishing of some off-diagonal entries of the concentration matrix $K$. We fix a graph $G = ([m],E)$ with vertex set $[m] = \{1,2,\ldots,m\}$ and whose edge set $E$ is assumed to contain all self-loops. A basis for $\mathcal{L}$ is the set $\{K_{ij} \mid (i,j) \in E\}$ of matrices $K_{ij}$ with a single 1-entry in position $(i,j)$ and 0-entries in all other positions. We shall use the notation $\mathcal{K}_G, \mathcal{C}_G, P_G$ for the objects $\mathcal{K}_{\mathcal{L}}, \mathcal{C}_{\mathcal{L}}, P_{\mathcal{L}}$, respectively. Given a sample covariance matrix $S$, the set $\operatorname{fiber}_G(S)$ consists of all positive definite matrices $\Sigma \in \mathbb{S}^m_{\succ 0}$ with

$$\Sigma_{ij} = S_{ij} \quad \text{for all } (i,j) \in E.$$

The cone of concentration matrices $\mathcal{K}_G$ is important for semidefinite matrix completion problems. Its closure was denoted $\mathcal{P}_G$ by Laurent (2001a,b). The dual cone $\mathcal{C}_G$ consists of all partial matrices $T \in \mathbb{R}^E$ with entries in positions $(i,j) \in E$, which can be extended to a full positive definite matrix. So, maximum likelihood estimation in Gaussian graphical models corresponds to the classical positive definite matrix completion problem (Barrett et al. 1993; Grone et al. 1984; Barrett et al. 1996; Laurent 2001b). In this section we investigate our three guiding questions, first for chordal graphs, next for the chordless $m$-cycle $C_m$, then for all graphs with at most five vertices, and finally for the $m$-wheel $W_m$.



4.1 Chordal graphs 

A graph $G$ is chordal (or decomposable) if every $m$-cycle in $G$ with $m \geq 4$ has a chord. A theorem due to Grone et al. (1984) fully resolves Question 3 when $G$ is chordal. Namely, a partial matrix $T \in \mathbb{R}^E$ lies in the cone $\mathcal{C}_G$ if and only if all principal submatrices $T_{CC}$ indexed by cliques $C$ in $G$ are positive definite. The "only if" direction in this statement is true for all graphs $G$, but the "if" direction holds only when $G$ is chordal. This result is equivalent to the characterization of chordal graphs as those that have sparsity order equal to one, i.e., all extreme rays of $\mathcal{K}_G$ are matrices of rank one. We refer to Agler et al. (1988) and Laurent (2001a) for details. From this characterization of chordal graphs in terms of sparsity order, we infer the following description of the Zariski closure of the boundary of $\mathcal{C}_G$.

Proposition 4.1. For a chordal graph $G$, the defining polynomial $H_G$ of $\partial \mathcal{C}_G$ is equal to

$$H_G = \prod_{C \text{ maximal clique of } G} \det(T_{CC}).$$
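For a small chordal graph the clique criterion and the polynomial $H_G$ can be evaluated directly. The sketch below is illustrative only: it takes the path $1 - 2 - 3$ with maximal cliques $\{1,2\}$ and $\{2,3\}$, stores a partial matrix as a Python dictionary, and uses helper names (`in_cone`, `H`) that are our own. Positive definiteness of each clique block is tested with Sylvester's criterion in exact rational arithmetic.

```python
from fractions import Fraction

def det(M):
    """Determinant by Laplace expansion (adequate for tiny clique blocks)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, a in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * a * det(minor)
    return total

def submatrix(T, C):
    """Principal block T_CC of a partial symmetric matrix stored as {(i, j): value}, i <= j."""
    return [[T[min(i, j), max(i, j)] for j in C] for i in C]

def is_positive_definite(M):
    # Sylvester's criterion: all leading principal minors are positive.
    return all(det([row[:k] for row in M[:k]]) > 0 for k in range(1, len(M) + 1))

def in_cone(T, maximal_cliques):
    """Clique test of Grone et al. for a chordal graph."""
    return all(is_positive_definite(submatrix(T, C)) for C in maximal_cliques)

def H(T, maximal_cliques):
    """Product of the clique determinants, as in Proposition 4.1."""
    prod = 1
    for C in maximal_cliques:
        prod *= det(submatrix(T, C))
    return prod

cliques = [(1, 2), (2, 3)]  # maximal cliques of the path 1 - 2 - 3
T_good = {(1, 1): Fraction(1), (2, 2): Fraction(1), (3, 3): Fraction(1),
          (1, 2): Fraction(1, 2), (2, 3): Fraction(1, 2)}
T_bad = dict(T_good)
T_bad[(1, 2)] = Fraction(1)  # the block on {1, 2} becomes singular

print(in_cone(T_good, cliques), H(T_good, cliques))
print(in_cone(T_bad, cliques), H(T_bad, cliques))
```

The second partial matrix lies on the boundary of the cone, where $H_G$ vanishes.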

We now turn to Question 2 regarding the ML degree of a Gaussian graphical model $G$. This number is here simply denoted by ML-degree($G$). Every chordal graph is a clique sum of complete graphs. We shall prove that the ML degree is multiplicative with respect to taking clique sums.

Lemma 4.2. Let $G$ be a clique sum of $n$ graphs $G_1, \ldots, G_n$. Then the following equality holds:

$$\text{ML-degree}(G) = \prod_{i=1}^{n} \text{ML-degree}(G_i).$$

Proof. We first prove this statement for $n = 2$. Let $G$ be a graph whose vertex set $V$ is partitioned into disjoint subsets $A$, $B$, $C$ such that $C$ is a clique and separates $A$ from $B$. Let $G_{[W]}$ denote the induced subgraph on a vertex subset $W \subset V$. So, we wish to prove:

$$\text{ML-degree}(G) = \text{ML-degree}(G_{[A \cup C]}) \cdot \text{ML-degree}(G_{[B \cup C]}). \tag{19}$$

Given a generic matrix $S \in \mathbb{S}^m$, we fix $\Sigma \in \mathbb{S}^m$ with entries $\Sigma_{ij} = S_{ij}$ for $(i,j) \in E$ and unknowns $\Sigma_{ij} = z_{ij}$ for $(i,j) \notin E$. The ML degree of $G$ is the number of complex solutions to the equations

$$(\Sigma^{-1})_{ij} = 0 \quad \text{for all } (i,j) \notin E. \tag{20}$$

Let $K = \Sigma^{-1}$ and denote by $K^1 = (\Sigma_{[A \cup C]})^{-1}$ (respectively, $K^2 = (\Sigma_{[B \cup C]})^{-1}$) the inverse of the submatrix of $\Sigma$ corresponding to the induced subgraph on $A \cup C$ (respectively, $B \cup C$). Using Schur complements, we can see that these matrices are related by the following block structure, with rows and columns ordered as $A$, $C$, $B$:

$$K = \begin{pmatrix} K^1_{AA} & K^1_{AC} & 0 \\ K^1_{CA} & K_{CC} & K^2_{CB} \\ 0 & K^2_{BC} & K^2_{BB} \end{pmatrix}.$$

This block structure reveals that, when solving the system (20), one can solve for the variables $z_{ij}$ corresponding to missing edges in the subgraph $A \cup C$ independently from the variables over $B \cup C$ and $A \cup B$. This implies the equation (19). Induction yields the lemma for $n \geq 3$. □

The following theorem characterizes chordal graphs in terms of their ML degree. It extends the equivalence of parts (iii) and (iv) in (Drton et al. 2009, Thm. 3.3.5) from discrete to Gaussian models.

Theorem 4.3. A graph $G$ is chordal if and only if ML-degree($G$) = 1.

Proof. The only-if direction follows from Lemma 4.2 since every chordal graph is a clique sum of complete graphs, and a complete graph trivially has ML degree one. For the if direction suppose that $G$ is a graph that is not chordal. Then $G$ contains the chordless cycle $C_m$ as an induced subgraph for some $m \geq 4$. It is easy to see that the ML degree of any graph is bounded below by that of any induced subgraph. Hence what we must prove is that the chordless cycle $C_m$ has ML degree strictly larger than one. This is precisely the content of Lemma 4.7 below. □
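ML degree one means that the estimator is a rational function of the data. For the simplest chordal graph that is not complete, the path $1 - 2 - 3$, the missing entry is filled in by $\hat\sigma_{13} = s_{12}s_{23}/s_{22}$, a standard formula for decomposable models that we state here as an illustration. The Python sketch below (function names are ours, the sample matrix is made up) checks in exact arithmetic that the $(1,3)$ cofactor of the completed matrix vanishes, so the concentration matrix has a zero in the position of the missing edge.

```python
from fractions import Fraction

def mle_path3(S):
    """MLE for the path 1 - 2 - 3: keep entries on edges, fill sigma_13 rationally."""
    Sigma = [row[:] for row in S]
    Sigma[0][2] = Sigma[2][0] = S[0][1] * S[1][2] / S[1][1]
    return Sigma

def cofactor_13(M):
    # Up to sign and the nonzero factor 1/det(M), this is the (1,3) entry of M^{-1}.
    return M[1][0] * M[2][1] - M[1][1] * M[2][0]

S = [[Fraction(4), Fraction(1), Fraction(2)],
     [Fraction(1), Fraction(3), Fraction(1)],
     [Fraction(2), Fraction(1), Fraction(5)]]
Sigma_hat = mle_path3(S)

print(Sigma_hat[0][2], cofactor_13(S), cofactor_13(Sigma_hat))
```

For the sample matrix itself the cofactor is nonzero, while for the completed matrix it is exactly zero.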



We now come to Question 1 which concerns the homogeneous prime ideal $P_G$ that defines the Gaussian graphical model as a subvariety of $\mathbb{P}(\mathbb{S}^m)$. Fix a symmetric $m \times m$-matrix of unknowns $\Sigma = (s_{ij})$ and let $\Sigma_{ij}$ denote the comaximal minor obtained by deleting the $i$th row and the $j$th column from $\Sigma$. We shall define several ideals in $\mathbb{R}[\Sigma]$ that approximate $P_G$. The first is the saturation

$$P'_G = \bigl( \langle \det(\Sigma_{ij}) \mid (i,j) \notin E \rangle : \langle \det(\Sigma) \rangle^{\infty} \bigr). \tag{21}$$

This ideal is contained in the desired prime ideal, i.e. $P'_G \subseteq P_G$. The two ideals have the same radical, but it might happen that they are not equal. One disadvantage of the ideal $P'_G$ is that the saturation step (21) is computationally expensive and terminates only for very small graphs.

A natural question is whether the prime ideal $P_G$ can be constructed easily from the prime ideals $P_{G_1}$ and $P_{G_2}$ when $G$ is a clique sum of two smaller graphs $G_1$ and $G_2$. As in the proof of Lemma 4.2 we partition $[m] = A \cup B \cup C$, where $G_1$ is the induced subgraph on $A \cup C$, and $G_2$ is the induced subgraph on $B \cup C$. If $|C| = c$ then we say that $G$ is a $c$-clique sum of $G_1$ and $G_2$.

The following ideal is contained in $P_G$ and defines the same algebraic variety in the open cone $\mathbb{S}^m_{\succ 0}$:

$$P_{G_1} + P_{G_2} + \langle\, (c+1) \times (c+1)\text{-minors of } \Sigma_{A \cup C,\, B \cup C} \,\rangle. \tag{22}$$

One might guess that (22) is equal to $P_G$, at least up to radical, but this fails for $c \geq 2$. Indeed we shall see in Example 4.5 that the variety of (22) can have extraneous components on the boundary $\mathbb{S}^m_{\succeq 0} \setminus \mathbb{S}^m_{\succ 0}$ of the semidefinite cone. We do conjecture, however, that this equality holds for $c \leq 1$. This is easy to prove for $c = 0$ when $G$ is disconnected and is the disjoint union of $G_1$ and $G_2$. The case $c = 1$ is considerably more delicate. At present, we do not have a proof that (22) is prime for $c = 1$, but we believe that even a lexicographic Gröbner basis for $P_G$ can be built by taking the union of such Gröbner bases for $P_{G_1}$ and $P_{G_2}$ with the $2 \times 2$-minors of $\Sigma_{A \cup C,\, B \cup C}$. This conjecture would imply the following.

Conjecture 4.4. The prime ideal $P_G$ of an undirected Gaussian graphical model is generated in degree $\leq 2$ if and only if each connected component of the graph $G$ is a 1-clique sum of complete graphs. In this case, $P_G$ has a Gröbner basis consisting of entries of $\Sigma$ and $2 \times 2$-minors of $\Sigma$.

This conjecture is an extension of the results and conjectures for (directed) trees in (Sullivant 2008, §5). Formulas for the degree of $P_G$ when $G$ is a tree are found in (Sullivant 2008, Corollaries 5.5 and 5.6). The "only if" direction in the first sentence of Conjecture 4.4 can be shown as follows. If $G$ is not chordal then it contains an $m$-cycle ($m \geq 4$) as an induced subgraph, and this gives rise to cubic generators for $P_G$, as seen in Subsection 4.2 below. If $G$ is chordal but is not a 1-clique sum of complete graphs, then its decomposition involves a $c$-clique sum for some $c \geq 2$, and the right hand side of (22) contributes a minor of size $c + 1 \geq 3$ to the minimal generators of $P_G$. The algebraic structure of chordal graphical models is more delicate in the Gaussian case than in the discrete case, and there is no Gaussian analogue to the characterizations of chordality in (i) and (ii) of (Drton et al. 2009, Theorem 3.3.5). This is highlighted by the following example which was suggested to us by Seth Sullivant.

Example 4.5. Let $G$ be the graph on $m = 7$ vertices consisting of the triangles $\{i, 6, 7\}$ for $i = 1,2,3,4,5$. Then $G$ is chordal because it is the 2-clique sum of these five triangles. The ideal $P_G$ is minimally generated by 105 cubics and one quintic. The cubics are spanned by the $3 \times 3$-minors of the matrices $\Sigma_{A \cup C,\, B \cup C}$ where $C = \{6,7\}$ and $\{A,B\}$ runs over all unordered partitions of $\{1,2,3,4,5\}$. These minors do not suffice to define the variety $V(P_G)$ set-theoretically. For instance, they vanish whenever the last two rows and columns of $\Sigma$ are zero. The additional quintic generator of $P_G$ equals

$$\begin{aligned} & s_{12}s_{13}s_{24}s_{35}s_{45} - s_{12}s_{13}s_{25}s_{34}s_{45} - s_{12}s_{14}s_{23}s_{35}s_{45} + s_{12}s_{14}s_{25}s_{34}s_{35} \\ & + s_{12}s_{15}s_{23}s_{34}s_{45} - s_{12}s_{15}s_{24}s_{34}s_{35} + s_{13}s_{14}s_{23}s_{25}s_{45} - s_{13}s_{14}s_{24}s_{25}s_{35} \\ & - s_{13}s_{15}s_{23}s_{24}s_{45} + s_{13}s_{15}s_{24}s_{25}s_{34} + s_{14}s_{15}s_{23}s_{24}s_{35} - s_{14}s_{15}s_{23}s_{25}s_{34}. \end{aligned}$$

This polynomial is the pentad which is relevant for factor analysis (Drton et al. 2009, Example 4.2.8). □



Given an undirected graph $G$ on $[m]$, we define its Sullivant-Talaska ideal $ST_G$ to be the ideal in $\mathbb{R}[\Sigma]$ that is generated by the following collection of minors of $\Sigma$. For any submatrix $\Sigma_{A,B}$ we include in $ST_G$ all $(c+1) \times (c+1)$-minors of $\Sigma_{A,B}$ provided $c$ is the smallest cardinality of a set $C$ of vertices that separates $A$ from $B$ in $G$; this matches the minors appearing in (22). Here, $A$, $B$ and $C$ need not be disjoint, and separation means that any path from a node in $A$ to a node in $B$ must pass through a node in $C$. Sullivant and Talaska (2008) showed that the generators of $ST_G$ are precisely those subdeterminants of $\Sigma$ that lie in $P_G$, and both ideals cut out the same variety in the positive definite cone $\mathbb{S}^m_{\succ 0}$. However, generally their varieties differ on the boundary of that cone, even for chordal graphs $G$, as seen in Example 4.5. In our experiments, we found that $ST_G$ can often be computed quite fast, and it frequently coincides with the desired prime ideal $P_G$.
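The rank condition behind these minors can be observed on the graph of Example 4.5. The sketch below is an illustration with made-up numbers: it builds a positive definite concentration matrix $K$ supported on the triangles $\{i,6,7\}$, inverts it exactly, and computes the rank of the block $\Sigma_{A \cup C,\, B \cup C}$ for $A = \{1,2,3\}$, $B = \{4,5\}$, $C = \{6,7\}$ (0-based in the code). The helpers `inverse` and `rank` are our own small Gaussian elimination routines over the rationals.

```python
from fractions import Fraction

def inverse(M):
    """Gauss-Jordan inverse over the rationals."""
    n = len(M)
    A = [list(row) + [Fraction(int(i == j)) for j in range(n)] for i, row in enumerate(M)]
    for c in range(n):
        p = next(r for r in range(c, n) if A[r][c] != 0)
        A[c], A[p] = A[p], A[c]
        A[c] = [a / A[c][c] for a in A[c]]
        for r in range(n):
            if r != c and A[r][c] != 0:
                A[r] = [a - A[r][c] * b for a, b in zip(A[r], A[c])]
    return [row[n:] for row in A]

def rank(M):
    """Rank by row reduction over the rationals."""
    A = [list(row) for row in M]
    rk = 0
    for c in range(len(A[0])):
        p = next((r for r in range(rk, len(A)) if A[r][c] != 0), None)
        if p is None:
            continue
        A[rk], A[p] = A[p], A[rk]
        for r in range(rk + 1, len(A)):
            f = A[r][c] / A[rk][c]
            A[r] = [a - f * b for a, b in zip(A[r], A[rk])]
        rk += 1
    return rk

n = 7
K = [[Fraction(0)] * n for _ in range(n)]
for i in range(n):
    K[i][i] = Fraction(10)                 # diagonally dominant, hence positive definite
for i in range(5):                         # triangles {i, 6, 7}; 0-based {i, 5, 6}
    K[i][5] = K[5][i] = Fraction(i + 1, 7)
    K[i][6] = K[6][i] = Fraction(1, i + 2)
K[5][6] = K[6][5] = Fraction(1)

Sigma = inverse(K)
A, B, C = [0, 1, 2], [3, 4], [5, 6]
block = [[Sigma[i][j] for j in B + C] for i in A + C]
print(rank(block))
```

The block has size $5 \times 4$ but rank only $|C| = 2$, so all of its $3 \times 3$-minors vanish.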



4.2 The chordless m-cycle 



We next discuss Questions 1, 2, and 3 for the simplest non-chordal graph, namely, the $m$-cycle $C_m$. Its Sullivant-Talaska ideal $ST_{C_m}$ is generated by the $3 \times 3$-minors of the submatrices $\Sigma_{A,B}$ where $A = \{i, i+1, \ldots, j-1, j\}$, $B = \{j, j+1, \ldots, i-1, i\}$, and $|i - j| \geq 2$. Here $\{A, B\}$ runs over all diagonals in the $m$-gon, and indices are understood modulo $m$. We conjecture that

$$P_{C_m} = ST_{C_m}. \tag{23}$$

We computed the ideal $P_{C_m}$ in Singular for small $m$. The following table lists the results:

| m | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|
| dimension d | 6 | 8 | 10 | 12 | 14 | 16 |
| degree | 1 | 9 | 57 | 312 | 1578 | 7599 |
| ML-degree | 1 | 5 | 17 | 49 | 129 | 321 |
| minimal generators (degree:number) | | 3:2 | 3:15 | 3:63 | 3:196 | 3:504 |

In all cases in this table, the minimal generators consist of cubics only, which is consistent with the conjecture (23). For the degree of the Gaussian $m$-cycle we conjecture the following formula.

Conjecture 4.6. The degree of the projective variety $V(P_{C_m})$ associated with the $m$-cycle equals

$$\frac{m+2}{4}\binom{2m}{m} - 3 \cdot 2^{2m-3}.$$
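As a sanity check, the conjectured closed forms can be compared with the computed values in the table for $3 \le m \le 8$. The short Python sketch below reads the formula of Conjecture 4.6 as $\frac{m+2}{4}\binom{2m}{m} - 3\cdot 2^{2m-3}$ and the conjectured ML degree of the $m$-cycle as $(m-3)\cdot 2^{m-2} + 1$; the function names are ours.

```python
from fractions import Fraction
from math import comb

def conjectured_degree(m):
    """Conjecture 4.6: (m+2)/4 * binom(2m, m) - 3 * 2^(2m-3)."""
    value = Fraction(m + 2, 4) * comb(2 * m, m) - 3 * 2 ** (2 * m - 3)
    assert value.denominator == 1  # the expression is an integer for these m
    return int(value)

def conjectured_ml_degree(m):
    """Conjectured ML degree of the m-cycle: (m-3) * 2^(m-2) + 1."""
    return (m - 3) * 2 ** (m - 2) + 1

# Values copied from the table of computations for the m-cycle.
table_degree = {3: 1, 4: 9, 5: 57, 6: 312, 7: 1578, 8: 7599}
table_ml_degree = {3: 1, 4: 5, 5: 17, 6: 49, 7: 129, 8: 321}

for m in range(3, 9):
    print(m, conjectured_degree(m), conjectured_ml_degree(m))
```

Both formulas reproduce every entry of the table in this range.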



Singular: www.singular.uni-kl.de/

Regarding Question 2, the following formula was conjectured in (Drton et al. 2009, §7.4):

$$\text{ML-degree}(C_m) = (m - 3) \cdot 2^{m-2} + 1, \quad \text{for } m \geq 3. \tag{24}$$

This quantity is an algebraic complexity measure for the following matrix completion problem. Given real numbers $x_i$ between $-1$ and $+1$, fill up the partially specified symmetric $m \times m$-matrix

$$\begin{pmatrix} 1 & x_1 & ? & ? & \cdots & ? & x_m \\ x_1 & 1 & x_2 & ? & \cdots & ? & ? \\ ? & x_2 & 1 & x_3 & \cdots & ? & ? \\ ? & ? & x_3 & 1 & x_4 & \cdots & ? \\ \vdots & \vdots & \vdots & \ddots & \ddots & \ddots & \vdots \\ ? & ? & ? & ? & \cdots & 1 & x_{m-1} \\ x_m & ? & ? & ? & \cdots & x_{m-1} & 1 \end{pmatrix} \tag{25}$$

to make it positive definite. We seek the unique fill-up that maximizes the determinant. The solution to this convex optimization problem is an algebraic function of $x_1, x_2, \ldots, x_m$ whose degree equals ML-degree($C_m$). We do not know how to prove (24) for $m \geq 9$. Even the following lemma is not easy.

Lemma 4.7. The ML-degree of the cycle $C_m$ is strictly larger than 1 for $m \geq 4$.

Sketch of Proof. We consider the special case of (25) when all of the parameters are equal:

$$x = x_1 = x_2 = \cdots = x_m. \tag{26}$$

Since the logarithm of the determinant is a concave function, the solution to our optimization problem is fixed under the symmetry group of the $m$-gon, i.e., it is a symmetric circulant matrix $\hat\Sigma_m$. Hence there are only $\lfloor \frac{m-2}{2} \rfloor$ distinct values for the question marks in (25), one for each symmetry class of long diagonals in the $m$-gon. We denote these unknowns by $s_1, s_2, \ldots, s_{\lfloor (m-2)/2 \rfloor}$, where $s_i$ is the unknown on the $(i+1)$-st circular off-diagonal. For instance, for $m = 7$, the circulant matrix we seek has two unknown entries $s_1$ and $s_2$, and it looks like this:

$$\hat\Sigma_7 = \begin{pmatrix} 1 & x & s_1 & s_2 & s_2 & s_1 & x \\ x & 1 & x & s_1 & s_2 & s_2 & s_1 \\ s_1 & x & 1 & x & s_1 & s_2 & s_2 \\ s_2 & s_1 & x & 1 & x & s_1 & s_2 \\ s_2 & s_2 & s_1 & x & 1 & x & s_1 \\ s_1 & s_2 & s_2 & s_1 & x & 1 & x \\ x & s_1 & s_2 & s_2 & s_1 & x & 1 \end{pmatrix}.$$

The key observation is that the determinant of the circulant symmetric matrix $\hat\Sigma_m$ factors into a product of $m$ linear factors with real coefficients, one for each $m$th root of unity. For example,

$$\det(\hat\Sigma_7) = \prod_{w \,:\, w^7 = 1} \bigl( 1 + (w + w^6) \cdot x + (w^2 + w^5) \cdot s_1 + (w^3 + w^4) \cdot s_2 \bigr).$$

Thus, for fixed $x$, our problem is to maximize a product of linear forms. By analyzing the critical equations, obtained by taking logarithmic derivatives of $\det(\hat\Sigma_m)$, we can show that the optimal solution $(\hat s_1, \hat s_2, \ldots, \hat s_{\lfloor (m-2)/2 \rfloor})$ is not a rational function in $x$. For example, when $m = 7$, the solution $(\hat s_1, \hat s_2)$ is an algebraic function of degree 3 in $x$. Its explicit representation is

$$\hat s_1 = \frac{x^2 + \hat s_2 x - \hat s_2^2 - \hat s_2}{1 - x} \quad \text{and} \quad \hat s_2^3 + (1 - 2x)\hat s_2^2 + (-x^2 + x - 1)\hat s_2 + x^3 = 0.$$

A detailed proof, for arbitrary $m$, will appear in the PhD dissertation of the second author. □
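The case $m = 7$ can be checked in exact arithmetic. The sketch below rests on an assumption of ours that is not part of the proof: we parametrize the solution by a number $0 < t < 1$ through $\sigma_d = t^d + t^{7-d}$, which are the entries (up to scaling) of the inverse of the circulant matrix with $1 + t^2$ on the diagonal and $-t$ on the two neighbouring circular off-diagonals. With $x = \sigma_1/\sigma_0$, $s_1 = \sigma_2/\sigma_0$, $s_2 = \sigma_3/\sigma_0$, the code verifies that the inverse of the circulant matrix has the zero pattern of the 7-cycle and that the cubic equation $s_2^3 + (1-2x)s_2^2 + (-x^2+x-1)s_2 + x^3 = 0$ and the formula $s_1 = (x^2 + s_2 x - s_2^2 - s_2)/(1-x)$ hold.

```python
from fractions import Fraction

def circulant(first_row):
    """Circulant matrix whose (i, j) entry is first_row[(j - i) mod n]."""
    n = len(first_row)
    return [[first_row[(j - i) % n] for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

t = Fraction(1, 2)  # hypothetical parameter; any rational 0 < t < 1 can be used
sigma = [t ** d + t ** (7 - d) for d in range(4)]
x, s1, s2 = (sigma[d] / sigma[0] for d in (1, 2, 3))

Sigma7 = circulant([1, x, s1, s2, s2, s1, x])
# Candidate concentration matrix: nonzero only on the diagonal and on the cycle edges.
K7 = circulant([1 + t * t, -t, 0, 0, 0, 0, -t])
P = matmul(Sigma7, K7)

# Sigma7 * K7 is a multiple of the identity, so Sigma7^{-1} is proportional to K7.
is_scalar = all(P[i][j] == (P[0][0] if i == j else 0)
                for i in range(7) for j in range(7))
cubic = s2 ** 3 + (1 - 2 * x) * s2 ** 2 + (-x * x + x - 1) * s2 + x ** 3
s1_formula = (x * x + s2 * x - s2 * s2 - s2) / (1 - x)

print(x, s1, s2, is_scalar, cubic, s1_formula == s1)
```

Since $K_7$ is positive definite for $|t| < 1$, the matrix `Sigma7` is the determinant-maximizing completion for the corresponding value of $x$.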
We now come to our third problem, namely to giving an algebraic description of the cone of sufficient statistics, denoted $\mathcal{C}_m := \mathcal{C}_{C_m}$. This is a full-dimensional open convex cone in $\mathbb{R}^{2m}$. The coordinates on $\mathbb{R}^{2m}$ are $s_{11}, s_{22}, \ldots, s_{mm}$ and $x_1 = s_{12}, x_2 = s_{23}, \ldots, x_m = s_{m1}$. We consider

$$\mathcal{C}'_m := \mathcal{C}_m \cap \{ s_{11} = s_{22} = \cdots = s_{mm} = 1 \}.$$

This is a full-dimensional open bounded spectrahedron in $\mathbb{R}^m$. It consists of all $(x_1, \ldots, x_m)$ such that (25) can be filled up to a positive definite matrix. The $2 \times 2$-minors of (25) imply that $\mathcal{C}'_m$ lies in the cube $(-1,1)^m = \{ |x_i| < 1 \}$. The issue is to identify further constraints. We note that any description of the $m$-dimensional spectrahedron $\mathcal{C}'_m$ leads to a description of the $2m$-dimensional cone $\mathcal{C}_m$ because a vector $s \in \mathbb{R}^{2m}$ lies in $\mathcal{C}_m$ if and only if the vector $x \in \mathbb{R}^m$ with the following coordinates lies in $\mathcal{C}'_m$:

$$x_i = \frac{s_{i,i+1}}{\sqrt{s_{ii}\, s_{i+1,i+1}}} \quad \text{for } i = 1,2,\ldots,m. \tag{27}$$

Barrett et al. (1993) gave a beautiful polyhedral description of the spectrahedron $\mathcal{C}'_m$. The idea is to replace each $x_i$ by its arc-cosine, that is, to substitute $x_i = \cos(\theta_i)$ into (25). Remarkably, the image of the spectrahedron $\mathcal{C}'_m$ under this transformation is a convex polytope. Explicit linear inequalities in the angle coordinates $\theta_i$ describing the facets of this polytope are given in Barrett et al. (1993).

To answer Question 3, we take the cosine-image of any of these facets and compute its Zariski closure. This leads to the following trigonometry problem. Determine the unique (up to scaling) irreducible polynomial $\Gamma'_m$ which is obtained by rationalizing the equation

$$x_1 = \cos\Bigl( \sum_{i=2}^{m} \arccos(x_i) \Bigr). \tag{28}$$

We call $\Gamma'_m$ the $m$-th cycle polynomial. Interestingly, $\Gamma'_m$ is invariant under all permutations of the $m$ variables $x_1, x_2, \ldots, x_m$. We also define the homogeneous $m$-th cycle polynomial $\Gamma_m$ to be the numerator of the image of $\Gamma'_m$ under the substitution (27). The first cycle polynomials arise for $m = 3$:

$$\Gamma'_3 = \det \begin{pmatrix} 1 & x_1 & x_3 \\ x_1 & 1 & x_2 \\ x_3 & x_2 & 1 \end{pmatrix} \quad \text{and} \quad \Gamma_3 = \det \begin{pmatrix} s_{11} & s_{12} & s_{13} \\ s_{12} & s_{22} & s_{23} \\ s_{13} & s_{23} & s_{33} \end{pmatrix}.$$



The polyhedral characterization of $\mathcal{C}_m$ given in Barrett et al. (1993) translates into the following theorem.

Theorem 4.8. The Zariski closure of the boundary of the cone $\mathcal{C}_m$, $m \geq 4$, is defined by the polynomial

$$H_{C_m}(s_{ij}) = \Gamma_m(s_{ij}) \cdot (s_{11}s_{22} - s_{12}^2) \cdot (s_{22}s_{33} - s_{23}^2) \cdots (s_{mm}s_{11} - s_{1m}^2).$$

To compute the cycle polynomial $\Gamma'_m$, we iteratively apply the sum formula for the cosine,

$$\cos(a + b) = \cos(a) \cdot \cos(b) - \sin(a) \cdot \sin(b),$$

and we then use the following relation to write (28) as an algebraic expression in $x_1, \ldots, x_m$:

$$\sin(\arccos(x_i)) = \sqrt{1 - x_i^2}.$$

Finally, we eliminate the square roots (e.g. by using resultants) to get the polynomial $\Gamma'_m$. For example, the cycle polynomial for the square ($m = 4$) has degree 6 and has 19 terms:

$$\Gamma'_4 = 4 \sum_{i<j<k} x_i^2 x_j^2 x_k^2 - 4 x_1x_2x_3x_4 \sum_{i} x_i^2 + \sum_{i} x_i^4 - 2 \sum_{i<j} x_i^2 x_j^2 + 8 x_1x_2x_3x_4.$$

By substituting (27) into this expression and taking the numerator, we obtain the homogeneous cycle polynomial $\Gamma_4$ which has degree 8. Here is a table summarizing what we know about the expansions of these cycle polynomials. Note that $\Gamma'_m$ and $\Gamma_m$ have different degrees but the same number of terms.



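| m | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|
| degree($\Gamma'_m$) | 3 | 6 | 15 | 30 | 70 | 140 | 315 | 630 | 1260 |
| degree($\Gamma_m$) | 3 | 8 | 20 | 48 | 112 | 256 | 576 | 1280 | 2816 |
| # of terms | 5 | 19 | 339 | 19449 | ? | ? | ? | ? | ? |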
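The cycle polynomial for the square can be tested numerically against its trigonometric definition. The Python sketch below assumes the reading $\Gamma'_4 = 4\sum_{i<j<k} x_i^2x_j^2x_k^2 - 4x_1x_2x_3x_4\sum_i x_i^2 + \sum_i x_i^4 - 2\sum_{i<j} x_i^2x_j^2 + 8x_1x_2x_3x_4$ and evaluates it at a point with $x_1 = \cos(\theta_2 + \theta_3 + \theta_4)$ and $x_i = \cos(\theta_i)$; the angles are arbitrary illustrative values and the function name is ours.

```python
import math
from itertools import combinations

def cycle_poly_4(x):
    """Gamma'_4 evaluated at x = (x1, x2, x3, x4)."""
    sq = [v * v for v in x]
    prod = math.prod(x)
    e3 = sum(a * b * c for a, b, c in combinations(sq, 3))
    e2 = sum(a * b for a, b in combinations(sq, 2))
    return 4 * e3 - 4 * prod * sum(sq) + sum(v ** 4 for v in x) - 2 * e2 + 8 * prod

angles = (0.3, 0.5, 0.9)  # theta_2, theta_3, theta_4 in radians, chosen arbitrarily
x = (math.cos(sum(angles)),) + tuple(math.cos(a) for a in angles)
residual = cycle_poly_4(x)
print(residual)
```

The residual vanishes up to rounding error, while a point that violates the cosine relation, such as $(1,0,0,0)$, gives a nonzero value.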
Table 1 Our three guiding questions for all non-chordal graphs with $m \leq 5$ vertices. Column 4 reports the degrees of the minimal generators together with the number of occurrences (degree:number). The last column lists the degrees of the irreducible factors of the polynomial $H_G$ that defines the Zariski closure of the boundary of $\mathcal{C}_G$. For each factor we report as a subscript the rank of the concentration matrices defining its dual irreducible component in the boundary of $\mathcal{K}_G$.



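| Graph G | dim d | deg $P_G$ | mingens $P_G$ | ML-deg | deg $H_G$ |
|---|---|---|---|---|---|
| 4-cycle $C_4$ | 8 | 9 | 3:2 | 5 | $4 \cdot 2_1 + 8_2$ |
| 5-cycle $C_5$ | 10 | 57 | 3:15 | 17 | $5 \cdot 2_1 + 20_3$ |
| (drawing lost) | 10 | 30 | 2:6, 3:4 | (illegible) | $5 \cdot 2_1 + 8_2$ |
| (drawing lost) | 11 | 31 | 3:10 | (illegible) | $3 \cdot 2_1 + 3_1 + 8_2$ |
| (drawing lost) | 11 | 56 | 3:7, 4:1 | (illegible) | (illegible)$\cdot 2_1 + 3 \cdot 8_2$ |
| graph of Fig. 2 | 12 | 24 | 3:4, 4:1 | 5 | $2 \cdot 2_1 + 2 \cdot 3_1 + 10_2$ |
| wheel $W_4$ | 13 | 16 | 4:2 | 5 | $4 \cdot 3_1 + 12_2$ |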



The degree of the $m$-th cycle polynomial $\Gamma'_m$ grows roughly like $2^m$, but we do not know an exact formula. However, for the homogeneous cycle polynomial $\Gamma_m$ we predict the following behavior.

Conjecture 4.9. The degree of the homogeneous $m$-th cycle polynomial $\Gamma_m$ equals $m \cdot 2^{m-3}$.

There is another way of defining and computing the cycle polynomial $\Gamma_m$, without any reference to trigonometry or semidefinite programming. Consider the prime ideal generated by the $3 \times 3$-minors of the generic symmetric $m \times m$-matrix $\Sigma = (s_{ij})$. Then $\langle \Gamma_m \rangle$ is the principal ideal obtained by eliminating all unknowns $s_{ij}$ with $|i - j| \geq 2$. Thus, geometrically, vanishing of the homogeneous polynomial $\Gamma_m$ characterizes partial matrices on the $m$-cycle $C_m$ that can be completed to a matrix of rank $\leq 2$. Similarly, vanishing of $\Gamma'_m$ characterizes partial matrices (25) that can be completed to rank $\leq 2$.

Independently of the work of Barrett et al. (1993), a solution to the problem of characterizing the cone $\mathcal{C}_m$ appeared in the same year in the statistics literature, namely by Buhl (1993). For statisticians, the cone $\mathcal{C}_m$ is the set of partial sample covariance matrices on the $m$-cycle for which the MLE exists.



4.3 Small graphs, suspensions and wheels 

We next examine Questions 1, 2 and 3 for all graphs with at most five vertices. In this analysis we can restrict ourselves to connected graphs only. Indeed, if $G$ is the disjoint union of two graphs $G_1$ and $G_2$ then the prime ideal $P_G$ is obtained from $P_{G_1}$ and $P_{G_2}$ as in (22) with $c = 0$, the ML-degrees multiply by Lemma 4.2, and the two dual cones both decompose as direct products:

$$\mathcal{C}_G = \mathcal{C}_{G_1} \times \mathcal{C}_{G_2} \quad \text{and} \quad \mathcal{K}_G = \mathcal{K}_{G_1} \times \mathcal{K}_{G_2}.$$

Chordal graphs were dealt with in Section 4.1. We now consider connected non-chordal graphs with $m \leq 5$ vertices. There are seven such graphs, and in Table 1 we summarize our findings for these seven graphs. In the first two rows of Table 1 we find the 4-cycle and the 5-cycle which were discussed in Subsection 4.2. As an illustration we examine in detail the graph in the second-to-last row of Table 1.
Fig. 2 A Gaussian graphical model on five vertices and seven edges having dimension d = 12.

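Example 4.10. The graph in Fig. 2 defines the Gaussian graphical model with concentration matrix

$$K = \begin{pmatrix} \lambda_1 & \lambda_6 & 0 & \lambda_9 & \lambda_{10} \\ \lambda_6 & \lambda_2 & \lambda_7 & 0 & \lambda_{11} \\ 0 & \lambda_7 & \lambda_3 & \lambda_8 & \lambda_{12} \\ \lambda_9 & 0 & \lambda_8 & \lambda_4 & 0 \\ \lambda_{10} & \lambda_{11} & \lambda_{12} & 0 & \lambda_5 \end{pmatrix}.$$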

We wish to describe the boundary of the cone $\mathcal{C}_G$ by identifying the irreducible factors in its defining polynomial $H_G$. We first use the Matlab software CVX, which is specialized in convex optimization, to find the ranks of all concentration matrices $K$ that are extreme rays in the boundary of $\mathcal{K}_G$. Using CVX, we maximize random linear functions over the compact spectrahedron $\mathcal{K}_G \cap \{\operatorname{trace}(K) = 1\}$, and we record the ranks of the optimal matrices. We found the possible matrix ranks to be 1 and 2, which agrees with the constraints $2 \leq p \leq 3$ seen in (14) for generic subspaces $\mathcal{L}$ with $m = 5$ and $d = 12$.

We next ran the software Singular to compute the minimal primes of the ideals of $p \times p$-minors of $K$ for $p = 2$ and $p = 3$, and thereafter we computed their dual ideals in $\mathbb{R}[t_1, t_2, \ldots, t_{12}]$ using Macaulay2. The latter step was done using the procedure with Jacobian matrices described in Subsection 2.2. We only retained dual ideals that are principal. Their generators are the candidates for factors of $H_G$.

The variety of rank one matrices $K$ has four irreducible components. Two of those components correspond to the edges $(3,4)$ and $(1,4)$ in Fig. 2. Their dual ideals are generated by the quadrics

$$p_1 = 4t_3t_4 - t_8^2 \quad \text{and} \quad p_2 = 4t_1t_4 - t_9^2.$$

The other two irreducible components of the variety of rank one concentration matrices correspond to the 3-cycles $(1, 2, 5)$ and $(2, 3, 5)$ in the graph. Their dual ideals are generated by the cubics

$$p_3 = 4t_1t_2t_5 - t_5t_6^2 - t_2t_{10}^2 + t_6t_{10}t_{11} - t_1t_{11}^2 \quad \text{and} \quad p_4 = 4t_2t_3t_5 - t_5t_7^2 - t_3t_{11}^2 + t_7t_{11}t_{12} - t_2t_{12}^2.$$

The variety of rank two matrices $K$ has two irreducible components. One corresponds to the chordless 4-cycle $(1,2,3,4)$ in the graph and its dual ideal is generated by $p_5 = \Gamma_4$, which is of degree 8. The other component consists of rank two matrices $K$ for which rows 2 and 5 are linearly dependent. The polynomial $p_6$ that defines the dual ideal consists of 175 terms and has degree 10.

The polynomial $H_G$ is the product of those principal generators $p_i$ whose hypersurface meets $\partial\mathcal{C}_G$. We again used CVX to check which of the six components actually contribute extreme rays in $\partial\mathcal{K}_G$. We found only one of the six components to be missing, namely that corresponding to the chordless 4-cycle $(1, 2, 3, 4)$. This means that $p_5$ is not a factor of $H_G$, and we conclude

$$H_G = p_1p_2p_3p_4p_6 \quad \text{and} \quad \deg(H_G) = 2 \cdot 2_1 + 2 \cdot 3_1 + 10_2. \tag{29}$$

Concerning Question 1 we note that the ideal $P_G$ is minimally generated by the four $3 \times 3$-minors of $\Sigma_{1235,134}$ and the determinant of $\Sigma_{1245,2345}$, and for Question 2 we note that the ML degree is five because the MLE can be derived from the MLE of the 4-cycle obtained by contracting the edge $(2, 5)$. □

The graph in the last row of Table 1 is the wheel $W_4$. It is obtained from the cycle $C_4$ in the first row by connecting all four vertices to a new fifth vertex. We see in Table 1 that the ML degree 5 is the same for both graphs, the two cubic generators of $P_{C_4}$ correspond to the two quartic generators of $P_{W_4}$, and there is a similar correspondence between the irreducible factors of the dual polynomials $H_{C_4}$ and $H_{W_4}$. In the remainder of this section we shall offer an explanation for these observations.



CVX: www.stanford.edu/~boyd/cvx/

Let $G = (V,E)$ be an undirected graph and $G^* = (V^*,E^*)$ its suspension graph with an additional completely connected vertex 0. The graph $G^*$ has vertex set $V^* = V \cup \{0\}$ and edge set $E^* = E \cup \{(0, v) \mid v \in V\}$. The $m$-wheel is the suspension graph of the $m$-cycle $C_m$; in symbols, $W_m = (C_m)^*$. We shall compare the Gaussian graphical models for the graph $G$ and its suspension graph $G^*$.

Theorem 4.11. The ML degree of a Gaussian graphical model with underlying graph G equals the ML 
degree of a Gaussian graphical model whose underlying graph is the suspension graph G* . 

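Proof. Let $V = [m]$ and let $S^* \in \mathbb{S}^{m+1}_{\succ 0}$ be a sample covariance matrix on $G^*$, where the first row and column correspond to the additional vertex 0. We denote by $S'$ the lower right $m \times m$ submatrix of $S^*$ corresponding to the vertex set $V$ and by $\tilde S$ the Schur complement of $S^*$ at $S^*_{00}$:

$$\tilde S := S' - \frac{1}{S^*_{00}} (S^*_{01}, \ldots, S^*_{0m})^T (S^*_{01}, \ldots, S^*_{0m}). \tag{30}$$

Then $\tilde S \in \mathbb{S}^m_{\succ 0}$ is a sample covariance matrix on $G$. Let $\tilde\Sigma$ be the MLE for $\tilde S$ on the graph $G$. We claim that the MLE $\hat\Sigma^*$ for $S^*$ on the suspension graph $G^*$ is given by

$$\hat\Sigma^* = \begin{pmatrix} S^*_{00} & (S^*_{01}, \ldots, S^*_{0m}) \\ (S^*_{01}, \ldots, S^*_{0m})^T & \tilde\Sigma + \frac{1}{S^*_{00}} (S^*_{01}, \ldots, S^*_{0m})^T (S^*_{01}, \ldots, S^*_{0m}) \end{pmatrix}.$$

Clearly, $\hat\Sigma^*$ is positive definite and satisfies $\hat\Sigma^*_{ij} = S^*_{ij}$ for all $(i,j) \in E^*$. The inverse of the covariance matrix $\hat\Sigma^*$ can be computed by using the inversion formula based on Schur complements. Writing $v = (S^*_{01}, \ldots, S^*_{0m})$, it reads

$$(\hat\Sigma^*)^{-1} = \begin{pmatrix} \frac{1}{S^*_{00}} + \frac{1}{(S^*_{00})^2}\, v\, \tilde\Sigma^{-1} v^T & -\frac{1}{S^*_{00}}\, v\, \tilde\Sigma^{-1} \\ -\frac{1}{S^*_{00}}\, \tilde\Sigma^{-1} v^T & \tilde\Sigma^{-1} \end{pmatrix}.$$

Since the lower right block equals $\tilde\Sigma^{-1}$, its entries are indeed zero in all positions $(i,j) \notin E^*$. We have shown that the MLE $\hat\Sigma^*$ is a rational function of the MLE $\tilde\Sigma$. This shows

$$\text{ML-degree}(G^*) \leq \text{ML-degree}(G).$$

The reverse inequality is also true since we can compute the MLE on $G$ for any $S \in \mathbb{S}^m_{\succ 0}$ by computing the MLE on $G^*$ for its extension $S^* \in \mathbb{S}^{m+1}_{\succ 0}$ with $S^*_{00} = 1$ and $S^*_{0j} = 0$ for $j \in [m]$. □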
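The construction in this proof can be traced on a small instance. The Python sketch below is an illustration with made-up data: it takes $G$ to be the path $1 - 2 - 3$, whose MLE is given by the rational fill-in $\sigma_{13} = s_{12}s_{23}/s_{22}$ (an assumption chosen so that the MLE on $G$ is available in closed form), forms the Schur complement as in (30), and assembles the matrix for the suspension $G^*$. All function names are ours. It then checks that the result agrees with $S^*$ on the edges of $G^*$ and that its inverse vanishes in the position of the missing edge.

```python
from fractions import Fraction

def inverse(M):
    """Gauss-Jordan inverse over the rationals."""
    n = len(M)
    A = [list(row) + [Fraction(int(i == j)) for j in range(n)] for i, row in enumerate(M)]
    for c in range(n):
        p = next(r for r in range(c, n) if A[r][c] != 0)
        A[c], A[p] = A[p], A[c]
        A[c] = [a / A[c][c] for a in A[c]]
        for r in range(n):
            if r != c and A[r][c] != 0:
                A[r] = [a - A[r][c] * b for a, b in zip(A[r], A[c])]
    return [row[n:] for row in A]

def mle_path3(S):
    """Rational MLE on the path 1 - 2 - 3 (fills in the (1,3) entry)."""
    Sigma = [row[:] for row in S]
    Sigma[0][2] = Sigma[2][0] = S[0][1] * S[1][2] / S[1][1]
    return Sigma

def mle_on_suspension(S_star, mle_on_G):
    """Assemble the MLE on G* from the MLE on G applied to the Schur complement."""
    a, v = S_star[0][0], S_star[0][1:]
    m = len(v)
    outer = [[v[i] * v[j] / a for j in range(m)] for i in range(m)]
    S_tilde = [[S_star[i + 1][j + 1] - outer[i][j] for j in range(m)] for i in range(m)]
    Sigma_tilde = mle_on_G(S_tilde)
    top = [a] + list(v)
    rest = [[v[i]] + [Sigma_tilde[i][j] + outer[i][j] for j in range(m)] for i in range(m)]
    return [top] + rest

S_star = [[Fraction(v) for v in row] for row in
          [[4, 1, 1, 1],
           [1, 5, 1, 2],
           [1, 1, 6, 1],
           [1, 2, 1, 7]]]   # diagonally dominant, hence positive definite
Sigma_star = mle_on_suspension(S_star, mle_path3)
K_star = inverse(Sigma_star)
print(Sigma_star[1][3], K_star[1][3])
```

Only the entry in position $(1,3)$ of the $4 \times 4$ matrix changes, and the corresponding entry of the inverse is exactly zero.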

We next address the question of how the boundary of the cone $\mathcal{C}_{G^*}$ can be expressed in terms of the boundary of $\mathcal{C}_G$. We use coordinates $t_{ij}$ for both $\mathbb{S}^m$ and its subspace $\mathbb{R}^E$, and we use the coordinates $u_{ij}$ for both $\mathbb{S}^{m+1}$ and its subspace $\mathbb{R}^{E^*}$. The Schur complement (30) defines a rational map from $\mathbb{S}^{m+1}$ to $\mathbb{S}^m$ which restricts to a rational map from $\mathbb{R}^{E^*}$ to $\mathbb{R}^E$. The formula is

$$t_{ij} = u_{ij} - \frac{u_{0i}u_{0j}}{u_{00}} \quad \text{for } 1 \leq i \leq j \leq m. \tag{31}$$

A partial matrix $(u_{ij})$ on $G^*$ can be completed to a positive definite $(m+1) \times (m+1)$-matrix if and only if the partial matrix $(t_{ij})$ on $G$ given by this formula can be completed to a positive definite $m \times m$-matrix. The rational map (31) takes the boundary of the cone $\mathcal{C}_{G^*}$ onto the boundary of the cone $\mathcal{C}_G$. For our algebraic question, we can derive the following conclusion:

Proposition 4.12. The polynomial $H_{G^*}(u_{ij})$ equals the numerator of the Laurent polynomial obtained from $H_G(t_{ij})$ by the substitution (31), and the same holds for each irreducible factor.
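The effect of the substitution $t_{ij} = u_{ij} - u_{0i}u_{0j}/u_{00}$ on a single factor can be seen on a quadric of the form $t_{11}t_{22} - t_{12}^2$. By the Schur complement formula its image is $\det(U)/u_{00}$, where $U$ is the $3 \times 3$ matrix on the vertices $0, 1, 2$, so the numerator is a cubic. The Python sketch below checks this identity in exact arithmetic for one made-up matrix; the helper names are ours.

```python
from fractions import Fraction

def substitute(u, i, j):
    """t_ij = u_ij - u_0i * u_0j / u_00."""
    return u[i][j] - u[0][i] * u[0][j] / u[0][0]

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

# Made-up symmetric matrix on the vertices 0, 1, 2 of the suspension.
u = [[Fraction(v) for v in row] for row in [[3, 1, 2], [1, 5, 4], [2, 4, 7]]]
t11, t12, t22 = substitute(u, 1, 1), substitute(u, 1, 2), substitute(u, 2, 2)
quadric = t11 * t22 - t12 ** 2

# The Laurent polynomial quadric has denominator u_00 and numerator det(U).
print(quadric, quadric * u[0][0], det3(u))
```

This is consistent with Table 1, where the quadratic factors for the 4-cycle are replaced by cubic factors for the wheel $W_4$.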

Example 4.13. The polynomial $H_{W_4}(u_{00}, u_{01}, u_{02}, u_{03}, u_{04}, u_{11}, u_{22}, u_{33}, u_{44}, u_{12}, u_{23}, u_{34}, u_{14})$ for the 4-wheel $W_4$ has as its main factor an irreducible polynomial of degree 12 which is the sum of 813 terms. It is obtained from the homogeneous cycle polynomial $\Gamma_4$ by the substitution (31). Recall that $\Gamma_4(t_{11}, t_{22}, t_{33}, t_{44}, t_{12}, t_{23}, t_{34}, t_{14})$ has only degree 8 and is the sum of 19 terms. □

We briefly discuss an issue raised by Question 1, namely, how to construct the prime ideal $P_{G^*}$ from the prime ideal $P_G$. Again, we can use the transformation (31) to turn every generator of $P_G$ into a Laurent polynomial whose numerator lies in $P_{G^*}$. However, the resulting polynomials will usually not suffice to generate $P_{G^*}$. This happens already for the 5-cycle $G = C_5$ and the 5-wheel $G^* = W_5$. The ideal $P_{C_5}$ is generated by 15 linearly independent cubics arising as $3 \times 3$-minors of the matrices $\Sigma_{132,1345}$, $\Sigma_{243,2451}$, $\Sigma_{354,3512}$, $\Sigma_{415,4123}$ and $\Sigma_{521,5234}$, while $P_{W_5}$ is generated by 20 linearly independent quartics arising as $4 \times 4$-minors of $\Sigma_{0132,01345}$, $\Sigma_{0243,02451}$, $\Sigma_{0354,03512}$, $\Sigma_{0415,04123}$ and $\Sigma_{0521,05234}$. Here is a table that summarizes what we know about the Gaussian wheels $W_m$.



| m | 3 | 4 | 5 | 6 |
|---|---|---|---|---|
| dimension d | 10 | 13 | 16 | 19 |
| degree | 1 | 16 | 198 | 2264 |
| ML-degree | 1 | 5 | 17 | 49 |
| minimal generators (degree:number) | | 4:2 | 4:20 | 4:108 |



5 Colored Gaussian graphical models 

We now add a graph coloring to the setup a nd study colored Gaussian graphical models. These were 
introduced bv lHoisgaard and Lauritzen ( 2008h who called them RCON-models. In the underlying graph 



G, the vertices are colored with p different colors and the edges are colored with q different colors: 

y = T/i u ^2 u • • • u Vp, p<\v\ 

E = EiUE2U---UEg, q< \E\. 

We denote the uncolored graph by G and the colored graph by G- In addition to the restrictions given 
by the missing edges in the graph, the entries of the concentration matrix K are now also restricted by 
equating entries in K according to the edge and vertex colorings. To be precise, the linear space £ of 
associated with a colored graph Q on m = \V\ nodes is defined by the following linear equations: 

— For any pair of nodes a, /3 that do not form an edge we set fc^^ = as before. 

— For any pair of nodes a,P in a common color class Vi we set kaa = kpp. 

— For any pair of edges (a, /?), (7, 5) in a common color class Ej we set kafj — k^s- 

The dimension of the model $\mathcal{G}$ is $d = p + q$. We note that, for any sample covariance matrix $S$,

\[
\pi_G(S) \in \mathcal{C}_G \quad \text{implies} \quad \pi_{\mathcal{G}}(S) \in \mathcal{C}_{\mathcal{G}}.
\]

Thus, introducing a graph coloring on $G$ relaxes the question of existence of the MLE.
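
The three rules above translate directly into a basis of the linear space, with one 0/1 matrix per color class. The following is a minimal sketch, not part of the original text: the helper names `colored_basis` and `pairing`, the 0-based vertex labels, the particular coloring of the 4-cycle and the toy matrix `S` are all assumptions made for illustration. It also assumes that the sufficient statistics are the trace pairings of `S` with the basis matrices.

```python
def colored_basis(m, vertex_classes, edge_classes):
    """Return one m x m basis matrix per color class (vertex classes first).

    Entries for non-edges are never touched, so they stay 0 as required.
    """
    basis = []
    for cls in vertex_classes:
        K = [[0] * m for _ in range(m)]
        for a in cls:
            K[a][a] = 1              # equal diagonal entries within a class
        basis.append(K)
    for cls in edge_classes:
        K = [[0] * m for _ in range(m)]
        for a, b in cls:
            K[a][b] = K[b][a] = 1    # equal off-diagonal entries within a class
        basis.append(K)
    return basis


def pairing(S, K):
    """Trace inner product <S, K> = trace(S K) for symmetric matrices."""
    m = len(S)
    return sum(S[a][b] * K[a][b] for a in range(m) for b in range(m))


# Hypothetical coloring of the 4-cycle 0-1-2-3-0 (0-based labels).
vertex_classes = [[0, 1], [2, 3]]                        # p = 2
edge_classes = [[(0, 1)], [(1, 2), (0, 3)], [(2, 3)]]    # q = 3
basis = colored_basis(4, vertex_classes, edge_classes)
d = len(basis)                                           # d = p + q

# Toy sample covariance matrix, chosen only for illustration.
S = [[4, 1, 0, 1],
     [1, 4, 1, 0],
     [0, 1, 3, 1],
     [1, 0, 1, 3]]
t = [pairing(S, K) for K in basis]
print(d, t)
```

Each off-diagonal class contributes twice the sum of the corresponding entries of `S`, because both the (a, b) and the (b, a) entry of the basis matrix are set.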

In this section we shall examine Questions 1-3 for various colorings $\mathcal{G}$ of the 4-cycle $G = C_4$. We begin
with an illustration of how colored Gaussian graphical models can be used in statistical applications.




Fig. 3 Colored Gaussian graphical model for Frets' heads; $L_i$, $B_i$ denote the length and breadth of the head of son $i$.
Example 5.1 (Frets' heads). We revisit the heredity study of head dimensions reported in Mardia et al.
(1979) and known to statisticians as Frets' heads. The data reported in this study consist of the length
and breadth of the heads of 25 pairs of first and second sons. Because of the symmetry between the two
sons, it makes sense to try to fit the colored Gaussian graphical model given in Fig. 3.

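This model has $d = 5$ degrees of freedom and it consists of all concentration matrices of the form

\[
K = \begin{pmatrix}
\lambda_1 & \lambda_3 & 0 & \lambda_4 \\
\lambda_3 & \lambda_1 & \lambda_4 & 0 \\
0 & \lambda_4 & \lambda_2 & \lambda_5 \\
\lambda_4 & 0 & \lambda_5 & \lambda_2
\end{pmatrix}.
\]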



In Fig. 3 the first random variable is denoted $L_1$, the second $L_2$, the third $B_2$, and the fourth $B_1$. Given
a sample covariance matrix $S = (s_{ij})$, the five sufficient statistics for this model are

\[
t_1 = s_{11} + s_{22}, \quad t_2 = s_{33} + s_{44}, \quad t_3 = 2s_{12}, \quad t_4 = 2(s_{23} + s_{14}), \quad t_5 = 2s_{34}. \tag{32}
\]

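The ideal of polynomials vanishing on $\mathcal{K}_{\mathcal{G}}^{-1}$ is generated by four linear forms and one cubic in the $s_{ij}$:

\[
P_{\mathcal{G}} = \langle\, s_{11} - s_{22},\; s_{33} - s_{44},\; s_{23} - s_{14},\; s_{13} - s_{24},\;
s_{23}^2 s_{24} - s_{24}^3 - s_{22}s_{23}s_{34} + s_{12}s_{24}s_{34} - s_{12}s_{23}s_{44} + s_{22}s_{24}s_{44} \,\rangle.
\]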



Note that the four linear constraints on the sample covariance matrix seen in $P_{\mathcal{G}}$ are also valid constraints
on the concentration matrix. Models with this property were studied in general by Jensen (1988) and
appear under the name RCOP-models in Højsgaard and Lauritzen (2008).

The data reported in the Frets' heads study result in the following sufficient statistics:

\[
t_1 = 188.256, \quad t_2 = 95.408, \quad t_3 = 133.750, \quad t_4 = 210.062, \quad t_5 = 67.302.
\]

Substituting these values into (32) and solving the equations on $V(P_{\mathcal{G}})$, we find the MLE for this data:

\[
\hat{\Sigma} = \begin{pmatrix}
94.1280 & 66.8750 & 44.3082 & 52.5155 \\
66.8750 & 94.1280 & 52.5155 & 44.3082 \\
44.3082 & 52.5155 & 47.7040 & 33.6510 \\
52.5155 & 44.3082 & 33.6510 & 47.7040
\end{pmatrix}.
\]
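
The computation of this MLE can be retraced numerically. The sketch below is an illustration under stated assumptions rather than the authors' procedure: it fixes all entries of the covariance matrix that are determined by (32) and by the four linear forms in the ideal, reads the cubic generator as a univariate cubic in the remaining unknown $g = s_{13} = s_{24}$, and keeps the root that gives a positive definite matrix. The helper names are hypothetical, and the trigonometric root formula is a standard device that is not mentioned in the text.

```python
import math

# Sufficient statistics reported for the Frets' heads data.
t1, t2, t3, t4, t5 = 188.256, 95.408, 133.750, 210.062, 67.302

# Entries fixed by (32) together with the four linear forms in the ideal.
a, b = t1 / 2, t2 / 2          # s11 = s22 and s33 = s44
c, e = t3 / 2, t5 / 2          # s12 and s34
f = t4 / 4                     # s23 = s14


def sigma(g):
    """Candidate covariance matrix with s13 = s24 = g."""
    return [[a, c, g, f],
            [c, a, f, g],
            [g, f, b, e],
            [f, g, e, b]]


def det(M):
    """Determinant by cofactor expansion (fine for matrices up to 4 x 4)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))


def is_positive_definite(M):
    """Sylvester's criterion: all leading principal minors are positive."""
    return all(det([row[:k] for row in M[:k]]) > 0 for k in range(1, len(M) + 1))


def inverse_entry(M, i, j):
    """Entry (i, j) of the inverse of M via the adjugate formula."""
    minor = [row[:i] + row[i + 1:] for r, row in enumerate(M) if r != j]
    return (-1) ** (i + j) * det(minor) / det(M)


# The cubic generator, read as a polynomial in g: g^3 - p*g + q = 0.
p = f * f + c * e + a * b
q = f * (a * e + c * b)

# Trigonometric formula for the three real roots of a depressed cubic.
r = 2 * math.sqrt(p / 3)
phi = math.acos(-3 * q / (2 * p) * math.sqrt(3 / p))
roots = [r * math.cos((phi - 2 * math.pi * k) / 3) for k in range(3)]

pd_roots = [g for g in roots if is_positive_definite(sigma(g))]
g_hat = pd_roots[0]
print(sorted(roots), g_hat)
print(inverse_entry(sigma(g_hat), 0, 2), inverse_entry(sigma(g_hat), 1, 3))
```

For this data the cubic has three real roots, only one of which yields a positive definite matrix. That root agrees with the entry 44.3082 displayed above up to rounding, and the corresponding inverse has vanishing (1,3) and (2,4) entries, as the missing edges of the 4-cycle require.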



Both the degree and the ML-degree of this colored Gaussian graphical model are 3, which answers
Questions 1 and 2. It remains to describe the boundary of the cone $\mathcal{C}_{\mathcal{G}}$ and to determine its defining
polynomial $H_{\mathcal{G}}$. The variety of rank one concentration matrices has four irreducible components:

\[
\langle k_2, k_4, k_5, k_1 + k_3 \rangle, \quad \langle k_2, k_4, k_5, k_1 - k_3 \rangle, \quad \langle k_1, k_3, k_4, k_2 + k_5 \rangle, \quad \langle k_1, k_3, k_4, k_2 - k_5 \rangle.
\]

These are points in $\mathbb{P}^4$ and the ideals of their dual hyperplanes are $\langle t_1 - t_3 \rangle$, $\langle t_1 + t_3 \rangle$, $\langle t_2 - t_5 \rangle$, $\langle t_2 + t_5 \rangle$.
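The variety of rank two concentration matrices is irreducible. Its prime ideal and the dual thereof are

\[
\langle\, k_2k_3 + k_1k_5,\; k_1k_2 - k_4^2 + k_3k_5,\; k_3k_4^2 + k_1^2k_5 - k_3^2k_5 \,\rangle
\quad \text{and} \quad
\langle\, 4t_2^2t_3^2 - 4t_1t_2t_4^2 + t_4^4 + 8t_1t_2t_3t_5 - 4t_3t_4^2t_5 + 4t_1^2t_5^2 \,\rangle.
\]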

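This suggests that the hypersurface $\partial \mathcal{C}_{\mathcal{G}}$ is given by the polynomial

\[
H_{\mathcal{G}} = (t_1 - t_3)(t_1 + t_3)(t_2 - t_5)(t_2 + t_5)(4t_2^2t_3^2 - 4t_1t_2t_4^2 + t_4^4 + 8t_1t_2t_3t_5 - 4t_3t_4^2t_5 + 4t_1^2t_5^2). \tag{33}
\]

Using CVX as in Example 4.10, we checked that all five factors meet $\partial \mathcal{C}_{\mathcal{G}}$, so (33) is indeed correct. □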
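
The rank-two part of Example 5.1 can be spot-checked in exact rational arithmetic. The sketch below rests on an assumption of ours that is not made in the text: in the coordinates $u = k_1 + k_3$, $v = k_2 + k_5$, $u' = k_1 - k_3$, $v' = k_2 - k_5$, $w = k_4$, the matrix $K$ splits into two $2 \times 2$ blocks, and $K$ has rank at most two when $uv = w^2$ and $u'v' = w^2$. The code evaluates the three ideal generators at one such point, confirms that all $3 \times 3$ minors of $K$ vanish, and evaluates the quartic factor of (33) at a hyperplane tangent to the rank-two variety at that point. All names and the chosen sample values are hypothetical.

```python
from fractions import Fraction as F
from itertools import product


def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))


def K_matrix(k1, k2, k3, k4, k5):
    """Concentration matrix of Example 5.1, with k_i in place of lambda_i."""
    return [[k1, k3, 0, k4],
            [k3, k1, k4, 0],
            [0, k4, k2, k5],
            [k4, 0, k5, k2]]


def generators(k1, k2, k3, k4, k5):
    """The three generators of the rank-two ideal, evaluated at a point."""
    return (k2 * k3 + k1 * k5,
            k1 * k2 - k4**2 + k3 * k5,
            k3 * k4**2 + k1**2 * k5 - k3**2 * k5)


def quartic(t1, t2, t3, t4, t5):
    """The quartic factor of H in (33)."""
    return (4 * t2**2 * t3**2 - 4 * t1 * t2 * t4**2 + t4**4
            + 8 * t1 * t2 * t3 * t5 - 4 * t3 * t4**2 * t5 + 4 * t1**2 * t5**2)


# Sample point with u*v = w^2 and u2*v2 = w^2 (u2, v2 stand for u', v').
a, b, r = F(1), F(2), F(3)
u, v, w = a * a, b * b, a * b
u2, v2 = a * a * r, b * b / r
k = ((u + u2) / 2, (v + v2) / 2, (u - u2) / 2, w, (v - v2) / 2)

K = K_matrix(*k)
minors = [det([row[:j] + row[j + 1:] for i2, row in enumerate(K) if i2 != i])
          for i, j in product(range(4), repeat=2)]

# Tangent hyperplane at k: combine the gradients of u*v - w^2 and u2*v2 - w^2
# with arbitrary multipliers s and s2, then convert back to t-coordinates.
s, s2 = F(1), F(2)
alpha, beta = s * v, s * u
alpha2, beta2 = s2 * v2, s2 * u2
t = (alpha + alpha2, beta + beta2, alpha - alpha2, -2 * w * (s + s2), beta - beta2)

print(generators(*k), set(minors), quartic(*t))
```

Exact fractions are used so that the zeros are genuine rather than floating-point approximations. A single sample point is of course only a consistency check, not a proof.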

We performed a similar analysis for all colored Gaussian graphical models on the 4-cycle $C_4$ which
have the property that edges in the same color class connect the same vertex color classes. The results
are presented in Tables 2, 3 and 4. These models are of special interest because they are invariant
under rescaling of variables in the same vertex color class. Such models were introduced and studied
by Højsgaard and Lauritzen (2008). For models with an additional permutation property (these are the
RCOP-models), we explicitly list the polynomial $H_{\mathcal{G}}$. A census of these models appears in Table 4.
Table 2 Results on Questions 1, 2, and 3 for all colored Gaussian graphical models with some symmetry restrictions
(namely, edges in the same color class connect the same vertex color classes) on the 4-cycle.
Columns: Graph, $K$, dim $d$, degree, mingens $P_{\mathcal{G}}$, ML-degree, deg $H_{\mathcal{G}}$.
[Graph drawings and matrices $K$ are not reproduced; the rows below list the surviving columns in table order.]

row   mingens P_G        deg H_G
1     1:4, 2:5           2_2 + 2_3 + 2_3
2     1:1, 2:10          2_2 + 4_3
3     1:4, 2:1           2_3
4     1:3, 2:4           2_2 + 4_3
5     1:3, 2:2, 3:4      2_3
6     1:1, 2:10, 3:1     2_2 + 2_2 + 4_3
7     2:8, 3:3           4_2 + 4_3
8     2:5, 3:10          2_2 + 2_2 + 4_3
9     2:5, 3:1           2_2 + 3_2
10    1:4, 2:1, 3:2      1_2 + 1_2 + 2_2
11    1:1, 2:5, 3:4      2_2 + 2_2
12    1:1, 2:5, 3:4      2_2 + 2_2
13    2:3, 3:4           1_1 + 1_1 + 1_1 + 1_1 + 4_2 + 4_2

[Further numerical entries whose rows cannot be identified: 11, 11, 13, 21, 15, 11, 11, 17.]
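Table 3 Continuation of Table 2.

\[
\begin{array}{c|c|c|c|c|c}
K & \dim d & \text{degree} & \text{mingens } P_{\mathcal{G}} & \text{ML-degree} & \deg H_{\mathcal{G}} \\ \hline
\begin{pmatrix} \lambda_1 & \lambda_3 & 0 & \lambda_6 \\ \lambda_3 & \lambda_2 & \lambda_4 & 0 \\ 0 & \lambda_4 & \lambda_1 & \lambda_5 \\ \lambda_6 & 0 & \lambda_5 & \lambda_2 \end{pmatrix}
& 6 & 21 & 3{:}10,\ 4{:}12 & & 2_2 + 2_2 \\
\begin{pmatrix} \lambda_1 & \lambda_3 & 0 & \lambda_6 \\ \lambda_3 & \lambda_1 & \lambda_4 & 0 \\ 0 & \lambda_4 & \lambda_1 & \lambda_5 \\ \lambda_6 & 0 & \lambda_5 & \lambda_2 \end{pmatrix}
& 6 & 17 & 2{:}2,\ 3{:}8,\ 4{:}1 & & 10_2 \\
\begin{pmatrix} \lambda_1 & \lambda_4 & 0 & \lambda_7 \\ \lambda_4 & \lambda_1 & \lambda_5 & 0 \\ 0 & \lambda_5 & \lambda_2 & \lambda_6 \\ \lambda_7 & 0 & \lambda_6 & \lambda_3 \end{pmatrix}
& 7 & 13 & 2{:}1,\ 3{:}3 & & 1_1 + 1_1 + 2_1 + 12_2 \\
\begin{pmatrix} \lambda_1 & \lambda_4 & 0 & \lambda_7 \\ \lambda_4 & \lambda_2 & \lambda_5 & 0 \\ 0 & \lambda_5 & \lambda_1 & \lambda_6 \\ \lambda_7 & 0 & \lambda_6 & \lambda_3 \end{pmatrix}
& 7 & 17 & 3{:}3,\ 4{:}6 & & 4_2 \\
\begin{pmatrix} \lambda_1 & \lambda_5 & 0 & \lambda_8 \\ \lambda_5 & \lambda_2 & \lambda_6 & 0 \\ 0 & \lambda_6 & \lambda_3 & \lambda_7 \\ \lambda_8 & 0 & \lambda_7 & \lambda_4 \end{pmatrix}
& 8 & 9 & 3{:}2 & & 2_1 + 2_1 + 2_1 + 2_1 + 8_2
\end{array}
\]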



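Table 4 All RCOP-models (Højsgaard and Lauritzen 2008) when the underlying graph is the 4-cycle.

\[
\begin{array}{c|c|c|c|c|c}
K & \dim d & \text{degree} & \text{mingens } P_{\mathcal{G}} & \text{ML-degree} & H_{\mathcal{G}} \\ \hline
\begin{pmatrix} \lambda_1 & \lambda_2 & 0 & \lambda_2 \\ \lambda_2 & \lambda_1 & \lambda_2 & 0 \\ 0 & \lambda_2 & \lambda_1 & \lambda_2 \\ \lambda_2 & 0 & \lambda_2 & \lambda_1 \end{pmatrix}
& 2 & & 1{:}7,\ 2{:}1 & & (2t_1 - t_2)(2t_1 + t_2) \\
\begin{pmatrix} \lambda_1 & \lambda_3 & 0 & \lambda_3 \\ \lambda_3 & \lambda_2 & \lambda_3 & 0 \\ 0 & \lambda_3 & \lambda_1 & \lambda_3 \\ \lambda_3 & 0 & \lambda_3 & \lambda_2 \end{pmatrix}
& 3 & & 1{:}5,\ 2{:}2 & & 16t_1t_2 - t_3^2 \\
\begin{pmatrix} \lambda_1 & \lambda_2 & 0 & \lambda_3 \\ \lambda_2 & \lambda_1 & \lambda_3 & 0 \\ 0 & \lambda_3 & \lambda_1 & \lambda_2 \\ \lambda_3 & 0 & \lambda_2 & \lambda_1 \end{pmatrix}
& 3 & & 1{:}6,\ 3{:}1 & & (t_1 - t_2)(t_1 + t_2)(t_1 - t_3)(t_1 + t_3) \\
\begin{pmatrix} \lambda_1 & \lambda_3 & 0 & \lambda_4 \\ \lambda_3 & \lambda_2 & \lambda_4 & 0 \\ 0 & \lambda_4 & \lambda_1 & \lambda_3 \\ \lambda_4 & 0 & \lambda_3 & \lambda_2 \end{pmatrix}
& 4 & & 1{:}4,\ 2{:}1,\ 3{:}2 & & (4t_1t_2 - t_3^2)(4t_1t_2 - t_4^2) \\
\begin{pmatrix} \lambda_1 & \lambda_4 & 0 & \lambda_4 \\ \lambda_4 & \lambda_2 & \lambda_5 & 0 \\ 0 & \lambda_5 & \lambda_3 & \lambda_5 \\ \lambda_4 & 0 & \lambda_5 & \lambda_2 \end{pmatrix}
& 5 & & 1{:}3,\ 2{:}1,\ 3{:}1 & & (8t_1t_2 - t_4^2)(8t_2t_3 - t_5^2) \\
\begin{pmatrix} \lambda_1 & \lambda_3 & 0 & \lambda_4 \\ \lambda_3 & \lambda_1 & \lambda_4 & 0 \\ 0 & \lambda_4 & \lambda_2 & \lambda_5 \\ \lambda_4 & 0 & \lambda_5 & \lambda_2 \end{pmatrix}
& 5 & 3 & 1{:}4,\ 3{:}1 & 3 & \text{(33) in Example 5.1}
\end{array}
\]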



Example 5.2. We can gain a different perspective on the proof of Lemma 4.7 by considering colored
Gaussian graphical models. Under the assumption (26) that all parameters in the partial matrix (25) are
equal to some fixed value $x$, the MLE $\hat{K}$ for the concentration matrix has the same structure. Namely, all
diagonal entries of $\hat{K}$ are equal, and all non-zero off-diagonal entries of $\hat{K}$ are equal. This means that we
can perform our MLE computation for the colored Gaussian graphical model with the chordless $m$-cycle
as underlying graph, where all vertices and all edges have the same color:

\[
K = \begin{pmatrix}
\lambda_1 & \lambda_2 & 0 & \cdots & 0 & \lambda_2 \\
\lambda_2 & \lambda_1 & \lambda_2 & \cdots & 0 & 0 \\
0 & \lambda_2 & \lambda_1 & \ddots & & \vdots \\
\vdots & & \ddots & \ddots & \lambda_2 & 0 \\
0 & 0 & \cdots & \lambda_2 & \lambda_1 & \lambda_2 \\
\lambda_2 & 0 & \cdots & 0 & \lambda_2 & \lambda_1
\end{pmatrix}.
\]

In contrast to the approach in the proof of Lemma 4.7, in this representation we only need to solve a
system of two polynomial equations in two unknowns, regardless of the cycle size $m$. The equations are

\[
(K^{-1})_{11} = 1 \quad \text{and} \quad (K^{-1})_{12} = x.
\]

By clearing denominators we obtain two polynomial equations in the unknowns $\lambda_1$ and $\lambda_2$. We need to
express these in terms of the parameter $x$, but there are many extraneous solutions. The ML degree is the
algebraic degree of the special solution $(\lambda_1(x), \lambda_2(x))$ which makes $K$ positive definite. □
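
The two equations of Example 5.2 can also be solved numerically for any cycle size. The sketch below is one possible way to do so and is not taken from the text. It assumes $0 < x < 1$, uses the standard fact that the matrix $K$ is circulant with eigenvalues $\lambda_1 + 2\lambda_2 \cos(2\pi j/m)$, and finds the ratio $\rho = \lambda_2/\lambda_1$ by bisection inside the range where $K$ is positive definite. The function name `cycle_mle` is hypothetical.

```python
import math


def cycle_mle(m, x, iterations=200):
    """Solve (K^-1)_11 = 1 and (K^-1)_12 = x for the one-color m-cycle, 0 < x < 1."""
    cosines = [math.cos(2 * math.pi * j / m) for j in range(m)]

    def inverse_entries(rho):
        # K(1, rho) is circulant with eigenvalues 1 + 2*rho*cos(2*pi*j/m).
        weights = [1 / (1 + 2 * rho * cj) for cj in cosines]
        s11 = sum(weights) / m
        s12 = sum(cj * wj for cj, wj in zip(cosines, weights)) / m
        return s11, s12

    # Bisection on rho = lambda2 / lambda1; K(1, rho) is positive definite
    # for -1/2 < rho <= 0, and s12/s11 runs from nearly 1 down to 0 there.
    lo, hi = -0.5 + 1e-12, 0.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        s11, s12 = inverse_entries(mid)
        if s12 / s11 > x:
            lo = mid
        else:
            hi = mid
    rho = (lo + hi) / 2
    lam1 = inverse_entries(rho)[0]   # rescale so that (K^-1)_11 = 1
    return lam1, rho * lam1


lam1, lam2 = cycle_mle(4, 0.5)
print(lam1, lam2)
```

For $m = 4$ and $x = 1/2$ the two equations can be solved by hand, giving $\lambda_1 = 1 + 1/\sqrt{3}$ and $\lambda_2 = -1/\sqrt{3}$, which the numerical solution reproduces. Restricting the search to positive definite $K$ is what discards the extraneous solutions mentioned above.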




Fig. 4 The cross section of the cone of sufficient statistics in Example 5.3 is the red convex body shown in the left figure.
It is dual to Cayley's cubic surface, which is shown in yellow in the right figure and also in Fig. 1 on the left.

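Example 5.3. Let $\mathcal{G}$ be the colored triangle with the same color for all three vertices and three distinct
colors for the edges. This is an RCOP model with $m = 3$ and $d = 4$. The corresponding subspace $\mathcal{L}$ of
$\mathbb{S}^3$ consists of all concentration matrices

\[
K = \begin{pmatrix}
\lambda_4 & \lambda_1 & \lambda_2 \\
\lambda_1 & \lambda_4 & \lambda_3 \\
\lambda_2 & \lambda_3 & \lambda_4
\end{pmatrix}.
\]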
This linear space $\mathcal{L}$ is generic enough so as to exhibit the geometric behavior described in Subsection 2.2.
The four-dimensional cone $\mathcal{K}_{\mathcal{L}}$ is the cone over the 3-dimensional spectrahedron bounded by Cayley's
cubic surface as shown on the right in Fig. 4. Its dual $\mathcal{C}_{\mathcal{L}}$ is the cone over the 3-dimensional convex body
shown on the left in Fig. 4. The boundary of this convex body consists of four flat 2-dimensional circular
faces (shown in black) and four curved surfaces whose common Zariski closure is a quartic Steiner surface.
Fig. 4 was made with surfex, a software package for visualizing algebraic surfaces.

Here, the inequalities (14) state $2 \le p \le 3$, and the algebraic degree of SDP is $\delta(3, 3, 2) = \delta(3, 3, 1) = 4$.
We find that $H_{\mathcal{L}}$ is a polynomial of degree 8 which factors into four linear forms and one quartic:

\[
H_{\mathcal{L}} = (t_1 - t_2 + t_3 - t_4)(t_1 + t_2 - t_3 - t_4)(t_1 - t_2 - t_3 + t_4)(t_1 + t_2 + t_3 + t_4)(t_1^2t_2^2 + t_1^2t_3^2 + t_2^2t_3^2 - 2t_1t_2t_3t_4).
\]

By Theorem 2.3, both the degree and the ML degree of this model are also equal to $\phi(3, 4) = 4$. □
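
The factors of this polynomial can be checked against rank-one matrices. The sketch below is an illustration under assumptions of ours: the sufficient statistics are taken to be the trace pairings of $S$ with the four basis matrices of the model, that is $t_1 = 2s_{12}$, $t_2 = 2s_{13}$, $t_3 = 2s_{23}$, $t_4 = s_{11} + s_{22} + s_{33}$, and the function names are hypothetical. It verifies that the quartic factor vanishes on the statistics of every rank-one matrix $S = vv^T$ from a small integer grid, and that pairing with the four rank-one concentration matrices $\varepsilon\varepsilon^T$, $\varepsilon \in \{\pm 1\}^3$, reproduces the four linear factors up to sign.

```python
from itertools import product


def statistics(v):
    """Sufficient statistics of the rank-one matrix S = v v^T."""
    v1, v2, v3 = v
    return (2 * v1 * v2, 2 * v1 * v3, 2 * v2 * v3, v1**2 + v2**2 + v3**2)


def quartic(t1, t2, t3, t4):
    """Quartic factor of the degree 8 polynomial in Example 5.3."""
    return t1**2 * t2**2 + t1**2 * t3**2 + t2**2 * t3**2 - 2 * t1 * t2 * t3 * t4


def linear_forms(t1, t2, t3, t4):
    """The four linear factors, in the order in which they are displayed."""
    return (t1 - t2 + t3 - t4, t1 + t2 - t3 - t4,
            t1 - t2 - t3 + t4, t1 + t2 + t3 + t4)


# The quartic vanishes on the statistics of every rank-one S = v v^T.
values = [quartic(*statistics(v)) for v in product(range(-3, 4), repeat=3)]

# Rank-one concentration matrices in the model are e e^T with e in {+1, -1}^3,
# i.e. (lambda1, lambda2, lambda3, lambda4) = (e1*e2, e1*e3, e2*e3, 1).
t = (1, 2, 3, 5)                      # arbitrary test point
pairings = set()
for e1, e2, e3 in product((1, -1), repeat=3):
    lam = (e1 * e2, e1 * e3, e2 * e3, 1)
    pairings.add(abs(sum(ti * li for ti, li in zip(t, lam))))

print(set(values), sorted(pairings), sorted(abs(f) for f in linear_forms(*t)))
```

The map $v \mapsto (2v_1v_2, 2v_1v_3, 2v_2v_3, v_1^2 + v_2^2 + v_3^2)$ used here is the classical parametrization of a Steiner surface, which matches the description of the curved boundary pieces given above.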

Acknowledgements We wish to thank Seth Sullivant, Bernd Ulrich and Ruriko Yoshida for helpful comments.